For whatever reason k3d runs some low-level traefik pod (resulting in some odd modprobe errors that I don’t even want to try to fix). So, today I installed K3s (instead of k3d, which runs in Docker). It seems to have no problems with traefik, so I’m going to try to install Rancher on it, but now the Rancher website is down, so I’m waiting to get to read the docs.
(NOTE: Got a call from SUSE today.. and in the past few days I got TWO calls from Databricks, simply because I’m browsing their tutorials. Please, stop the damn direct marketing calls, it’s insane!!!)
I thought it would be nice to have personal persistent storage, so I was thinking of buying a NAS for that. Seems it’s going to be a bit more complicated: I might have to set up a NAS CSI (Container Storage Interface) driver, storage classes, etc., so that’s probably going to be one deep rabbit hole again. (Rough sketch below.)
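For future me, a minimal sketch of what a StorageClass for an NFS-capable NAS might look like, assuming the Kubernetes NFS CSI driver (csi-driver-nfs); the server address and share path are placeholders, not real values:

```yaml
# StorageClass backed by the NFS CSI driver; the provisioner name is the
# driver's registered name, server/share are hypothetical NAS values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nas-nfs
provisioner: nfs.csi.k8s.io
parameters:
  server: 192.168.1.50    # hypothetical NAS address
  share: /volume1/k8s     # hypothetical exported NFS share
reclaimPolicy: Retain
```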
As I was planning to start yet another Python project, I realized that I have way too many venvs scattered everywhere (I usually create a ‘dedicated’ venv in the project dir). So now I’m looking into virtualenvwrapper to store the venvs in a sensible place, and also to share the same venv across similar projects so I don’t have to install pytorch etc. everywhere, eating my precious SSD space.
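A minimal sketch of the virtualenvwrapper workflow, assuming it’s installed and sourced in the shell profile (the env name is made up):

```bash
# Keep all venvs in one place instead of scattered per-project dirs.
export WORKON_HOME=~/.virtualenvs
source virtualenvwrapper.sh   # script location varies by install

mkvirtualenv ml-common        # create a shared venv (hypothetical name)
workon ml-common              # activate it from any project directory
pip install torch             # install pytorch once, reuse across projects
deactivate
```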
Finally, the local Rancher is running.
A model can be materialized as a table with the following configuration block at the top of the model file:
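```sql
{{ config(
    materialized='table'
) }}
```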
When dbt run is executing, dbt wraps the select statement in the correct DDL/DML to build that model as a table or view. If that model already exists in the data warehouse, dbt will automatically drop that table or view before building the new database object. Note: If you are on BigQuery, you may need to run dbt run --full-refresh for this to take effect.
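For intuition, the wrapper dbt generates for a table materialization looks roughly like this; the exact statements vary by adapter, and the schema/model names here are hypothetical:

```sql
-- Approximate shape of the DDL dbt generates around a table model;
-- on warehouses without create-or-replace, dbt drops then recreates.
create or replace table analytics.dim_customers as (
    select * from analytics.stg_customers  -- the model's select statement
);
```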
Naming Conventions

In working on this project, we established some conventions for naming our models.
- Sources (src) refer to the raw table data that have been built in the warehouse through a loading process. (We will cover configuring Sources in the Sources module.)
- Staging (stg) refers to models that are built directly on top of sources. These have a one-to-one relationship with source tables, and are used for very light transformations that shape the data into what you want it to be. These models clean and standardize the data before it is transformed downstream. Note: these are typically materialized as views. (A minimal example follows this list.)
- Intermediate (int) refers to any models that exist between final fact and dimension tables. These should be built on staging models rather than directly on sources, to leverage the data cleaning that was done in staging.
- Fact (fct) refers to any data that represents something that occurred or is occurring. Examples include sessions, transactions, orders, stories, votes. These are typically skinny, long tables.
- Dimension (dim) refers to data that represents a person, place or thing. Examples include customers, products, candidates, buildings, employees.

Note: The Fact and Dimension convention is based on previous normalized modeling.
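To make the stg convention concrete, a minimal staging model might look like this (the source and column names are hypothetical, in the style of dbt's jaffle_shop demo project):

```sql
-- models/staging/stg_customers.sql
-- One-to-one with the raw source table; light renaming and typing only.
select
    id as customer_id,
    first_name,
    last_name,
    created_at
from {{ source('jaffle_shop', 'customers') }}
```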
https://github.com/garystafford/kinesis-redshift-streaming-demo