(Add timestamp pitfalls) |
|||
Line 31: | Line 31: | ||
=== Pitfalls === | === Pitfalls === | ||
The [http://wingolab.org/2017/04/byteordermark byte-order mark] showing up at the beginning of the first field name in your file. Excel seems to add this character by default (unless the user tells it not to). As usual, the moral of the story is "Never use Excel". | * The [http://wingolab.org/2017/04/byteordermark byte-order mark] showing up at the beginning of the first field name in your file. Excel seems to add this character by default (unless the user tells it not to). As usual, the moral of the story is "Never use Excel". | ||
* Using a local timestamp instead of a UTC timestamp as a primary key often leads to problems. Because of Daylight Saving Time, one day each year in a series of hourly local timestamps skips an hour, and another day has the same local timestamp twice. The [https://www.caktusgroup.com/blog/2019/03/21/coding-time-zones-and-daylight-saving-time/ general] [https://www.jamesridgway.co.uk/why-storing-datetimes-as-utc-isnt-enough/ advice] is to store (and publish) both the UTC timestamp and the local timestamp. We use the UTC timestamp for primary keys and other data operations, but also publish the local timestamp to make it easier for the user to understand the data. | |||
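The timestamp pitfall can be illustrated with a minimal Python sketch. It is not taken from any particular ETL job: the <code>America/New_York</code> zone, the 2021 dates, and the helper name <code>hourly_rows</code> are assumptions chosen for illustration. It uses the standard-library <code>zoneinfo</code> module (Python 3.9+), which needs time zone data available on the system.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumed zone, for illustration only; substitute the zone of your data.
LOCAL_TZ = ZoneInfo("America/New_York")


def hourly_rows(start_utc, hours):
    """Return (utc, naive_local) pairs for consecutive hourly readings.

    Hypothetical helper: the naive local value mimics a local timestamp
    column that carries no UTC offset.
    """
    rows = []
    for i in range(hours):
        utc = start_utc + timedelta(hours=i)
        local = utc.astimezone(LOCAL_TZ).replace(tzinfo=None)
        rows.append((utc, local))
    return rows


# Autumn transition (2021-11-07 in this zone): one local hour occurs twice.
fall = hourly_rows(datetime(2021, 11, 7, 4, tzinfo=timezone.utc), 4)
fall_utc_keys = {utc for utc, _ in fall}
fall_local_keys = {local for _, local in fall}

# Spring transition (2021-03-14 in this zone): one local hour never occurs.
spring = hourly_rows(datetime(2021, 3, 14, 5, tzinfo=timezone.utc), 4)
spring_local_hours = [local.hour for _, local in spring]

for utc, local in fall + spring:
    print(utc.isoformat(), "|", local.isoformat())
```

Four hourly readings across the autumn transition yield four distinct UTC values but only three distinct local values, so the local timestamp cannot serve as a primary key. Across the spring transition the local series jumps straight from 01:00 to 03:00, while the UTC series stays evenly spaced.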
== Testing ETL jobs == | == Testing ETL jobs == |