Some WPRDC ETL processes are still in an older framework; once they're all migrated over, it will be possible to extract a catalog of all ETL processes by parsing the job parameters in the files that represent the ETL jobs.
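As a rough illustration of what such a catalog extraction could look like, here is a minimal sketch. It assumes (this is an assumption, not something documented here) that each job file defines a top-level list, hypothetically named <code>job_dicts</code>, whose elements are dictionaries of job parameters. It reads the file's source with Python's <code>ast</code> module instead of importing it, and keeps only parameters whose values are plain literals.

```python
import ast


def literal_job_parameters(source, list_name="job_dicts"):
    """Collect literal-valued job parameters from a job file's source code.

    `list_name` is a hypothetical name for the top-level list of job
    dictionaries; adjust it to whatever the job files actually use.
    """
    jobs = []
    for node in ast.parse(source).body:
        # Only look at top-level assignments such as `job_dicts = [...]`.
        if not isinstance(node, ast.Assign):
            continue
        if not any(isinstance(t, ast.Name) and t.id == list_name for t in node.targets):
            continue
        if not isinstance(node.value, ast.List):
            continue
        for element in node.value.elts:
            if not isinstance(element, ast.Dict):
                continue
            job = {}
            for key, value in zip(element.keys, element.values):
                if key is None:  # a `**other_dict` unpacking has no key
                    continue
                try:
                    job[ast.literal_eval(key)] = ast.literal_eval(value)
                except (ValueError, TypeError):
                    # Skip values that are not plain literals
                    # (for example, references to classes or functions).
                    continue
            jobs.append(job)
    return jobs
```

Running this over every job file and concatenating the results would give a simple list of dictionaries that could be written out as a table of ETL processes.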
== Getting data ==
Some of the sources we get data from:
* FTP servers
** Here's the [https://docs.ipswitch.com/MOVEit/Transfer2019_1/API/Rest/#_getapi_v1_files_id_download-1_0 API documentation for the MOVEit FTP server]
* APIs
** Google Cloud infrastructure could count as an API
** Some custom-built APIs by individual vendors
* Plain old web sites
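For the MOVEit case, the linked documentation describes a file-download endpoint of the form <code>/api/v1/files/{id}/download</code>. The sketch below shows one way to fetch a file from such an endpoint using only the Python standard library. The host name, file ID, and token are placeholders, the function names are made up for this example, and the use of a bearer token in the <code>Authorization</code> header is an assumption to check against the server's documentation.

```python
import shutil
import urllib.request
from urllib.parse import quote


def moveit_download_url(base_url, file_id):
    """Build the download URL for a file ID on a MOVEit-style REST API."""
    return f"{base_url.rstrip('/')}/api/v1/files/{quote(str(file_id), safe='')}/download"


def download_file(base_url, file_id, token, destination, opener=urllib.request.urlopen):
    """Stream one remote file to `destination` and return that path.

    `opener` can be swapped out in tests to avoid real network access.
    """
    request = urllib.request.Request(
        moveit_download_url(base_url, file_id),
        # Assumption: the server accepts a previously obtained bearer token.
        headers={"Authorization": f"Bearer {token}"},
    )
    with opener(request) as response, open(destination, "wb") as f:
        shutil.copyfileobj(response, f)  # copy in chunks, not all at once
    return destination
```

A call might look like <code>download_file("https://moveit.example.org", "12345", token, "latest.csv")</code>, where every argument is a placeholder. Streaming the response to disk keeps memory use flat for large extracts.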
== Writing ETL jobs ==