Difference between revisions of "ETL"

32 bytes added, 14:26, 22 April 2022
m
Update daylight-savings-time warnings
Line 32: Line 32:


* The [http://wingolab.org/2017/04/byteordermark byte-order mark] showing up at the beginning of the first field name in your file. Excel seems to add this character by default (unless the user tells it not to). As usual, the moral of the story is "Never use Excel".
* Using a local timestamp instead of a UTC timestamp as a primary key often leads to problems. Because of Daylight Saving Time, a series of hourly local timestamps skips an hour on one day each year and repeats the same local timestamp on another day. The [https://www.caktusgroup.com/blog/2019/03/21/coding-time-zones-and-daylight-saving-time/ general] [https://www.jamesridgway.co.uk/why-storing-datetimes-as-utc-isnt-enough/ advice] is to store (and publish) both the UTC timestamp and the local timestamp. We use the UTC timestamp for primary keys and other data operations, and publish the local timestamp as well so that users can interpret the data more easily.
* Using a local timestamp instead of a UTC timestamp as a primary key often leads to problems. Because of Daylight Saving Time, a series of hourly local timestamps skips an hour on one day each year (prior to 2023) and repeats the same local timestamp on another day (prior to 2022). The [https://www.caktusgroup.com/blog/2019/03/21/coding-time-zones-and-daylight-saving-time/ general] [https://www.jamesridgway.co.uk/why-storing-datetimes-as-utc-isnt-enough/ advice] is to store (and publish) both the UTC timestamp and the local timestamp. We use the UTC timestamp for primary keys and other data operations, and publish the local timestamp as well so that users can interpret the data more easily.
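
The byte-order-mark problem in the list above can be reproduced and avoided with a few lines of Python. This is only a minimal sketch: the sample bytes, the field names <code>site_id</code> and <code>reading</code>, and the helper <code>read_header</code> are made up for illustration and are not part of any ETL job described here. It relies on Python's standard <code>utf-8-sig</code> codec, which drops a leading BOM when one is present.

```python
import csv
import io

# Hypothetical sample: CSV bytes as a spreadsheet export might write them,
# with a UTF-8 byte-order mark (EF BB BF) in front of the header row.
raw = b"\xef\xbb\xbfsite_id,reading\r\nA1,3.5\r\n"


def read_header(data: bytes, encoding: str) -> list[str]:
    """Return the field names from the first row of CSV bytes."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


# Plain "utf-8" keeps the BOM, so the first field name starts with U+FEFF.
naive = read_header(raw, "utf-8")

# "utf-8-sig" strips a leading BOM if present and changes nothing otherwise.
clean = read_header(raw, "utf-8-sig")

print(repr(naive[0]))
print(repr(clean[0]))
```

Decoding with <code>utf-8-sig</code> is safe for files without a BOM too, so it may be a reasonable default when the origin of a file is unknown.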


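The local-timestamp problem described above can be illustrated with Python's standard <code>zoneinfo</code> module. This is a sketch under stated assumptions: the zone <code>America/New_York</code>, the 2021 dates, and the helper <code>hourly_rows</code> are chosen only for illustration, and the system must have a time zone database available. It builds an hourly series keyed by UTC and shows that the matching local labels repeat in autumn and skip an hour in spring.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumed zone, for illustration only.
LOCAL = ZoneInfo("America/New_York")
FMT = "%Y-%m-%d %H:%M"


def hourly_rows(start_utc: datetime, hours: int) -> list[tuple[str, str]]:
    """Return (utc_key, local_label) pairs for consecutive hours."""
    rows = []
    for i in range(hours):
        utc = start_utc + timedelta(hours=i)
        local = utc.astimezone(LOCAL)
        # The local label carries no offset, like a naive local column.
        rows.append((utc.strftime(FMT), local.strftime(FMT)))
    return rows


# Clocks go back on 2021-11-07 in this zone: one local hour occurs twice.
fall = hourly_rows(datetime(2021, 11, 7, 4, 0, tzinfo=timezone.utc), 4)

# Clocks go forward on 2021-03-14 in this zone: one local hour never occurs.
spring = hourly_rows(datetime(2021, 3, 14, 5, 0, tzinfo=timezone.utc), 4)

for utc_key, local_label in fall + spring:
    print(utc_key, "UTC ->", local_label, "local")
```

In both series the UTC keys stay unique and evenly spaced, which is why the UTC column is the safer choice for a primary key, while the local column remains useful for display.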
== Testing ETL jobs ==
