Tooling

From WPRDC Wiki

Data tools

  • VisiData - Terminal-based tool for exploring and manipulating data; it can handle large datasets.
    • If you REALLY don't want to use VisiData because you don't want to work in the terminal, Modern CSV looks like a decent CSV editor that can handle large files and won't mangle your CSV file (unlike Excel).
      • If, in a pinch, you absolutely have to use Excel to edit a CSV file (which you really shouldn't): 1) make a copy of your CSV file, saving it as a text file (with the "txt" extension); 2) open the copy in Excel using the text import function; 3) set all of the fields to be text fields; 4) edit the file in Excel; 5) when finished, use "Save as..." to save it as a CSV file.
  • qsv - "command line program for indexing, slicing, analyzing, splitting, enriching, validating & joining CSV files. Commands are simple, fast and composable." Forked from xsv by CKAN Joel, so it's got some CKAN-specific features in the works.
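
The reason for importing every field as text in the Excel steps above is that type guessing can silently rewrite values (for example, stripping leading zeros from ZIP codes). For a small scripted edit, the same all-text approach can be sketched in Python with the standard csv module. This is only an illustration, not a WPRDC tool: the function name edit_csv_as_text, the column names, and the sample data are all hypothetical.

```python
import csv
import io


def edit_csv_as_text(src, dst, column, old, new):
    """Copy CSV rows from src to dst, replacing `old` with `new` in one column.

    The csv module reads every field as a string, so values such as ZIP
    codes with leading zeros are written back exactly as they were read.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column] == old:
            row[column] = new
        writer.writerow(row)


# Hypothetical sample data: the "zip" column has a leading zero to preserve.
sample = "zip,name\n01234,Alpha\n15213,Beta\n"
out = io.StringIO()
edit_csv_as_text(io.StringIO(sample), out, "name", "Beta", "Gamma")
result = out.getvalue()
print(result)
```

With real files, the in-memory buffers would be replaced by file objects opened with newline="" (as the csv module documentation recommends), and the output written to a new file so that the original stays untouched.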

Anonymization

  • Open Diffix - Free, open-source desktop tool (and eventually Postgres extension) for anonymizing data.