Command line interface (CLI)

After installing Carpyncho, a command-line application is available to download any dataset.

[1]:
carpyncho --help
Usage: carpyncho command [args...]

Carpyncho console client.

Explore and download the entire https://carpyncho.github.io/ catalogs from your
command line.

Commands:
  catalog-info       Retrieve the information about a given catalog.
  download-catalog   Retrives a catalog from th Carpyncho dataset collection.
  has-catalog        Check if a given catalog and tile exists.
  list-catalogs      Show the available catalogs for a given tile.
  list-tiles         Show available tiles.
  version            Print Carpyncho version.

This software is under the BSD 3-Clause License. Copyright (c) 2020, Juan
Cabral. For bug reporting or other instructions please check:
https://github.com/carpyncho/carpyncho-py

To list all available tiles, we can run:

[2]:
carpyncho list-tiles
- b206
- b214
- b216
- b220
- b228
- b234
- b247
- b248
- b261
- b262
- b263
- b264
- b277
- b278
- b356
- b360
- b396
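If the tile list is needed inside a script, the listing above can be parsed line by line. The following is a minimal sketch, not part of Carpyncho itself: the helper names `parse_tiles` and `list_tiles` are hypothetical, and it assumes the output keeps the `- name` shape shown above.

```python
import subprocess
from typing import List


def parse_tiles(listing: str) -> List[str]:
    """Extract names from lines shaped like '- b206' (leading spaces allowed)."""
    tiles = []
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith("- "):
            tiles.append(line[2:].strip())
    return tiles


def list_tiles() -> List[str]:
    """Run `carpyncho list-tiles` and parse its output.

    Requires the carpyncho command to be installed and on the PATH.
    """
    result = subprocess.run(
        ["carpyncho", "list-tiles"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return parse_tiles(result.stdout)


# Parsing a fragment of the listing shown above, without calling the CLI.
sample = "- b206\n- b214\n- b216\n"
print(parse_tiles(sample))
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser against saved output without network access.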

Then we can check all the available catalogs for a given tile (b216, for example):

[3]:
carpyncho list-catalogs b216
Tile b216
    - features
    - lc

Let's assume we want to download the features catalog from tile b216. First, let's check how big the catalog is before downloading it:

[4]:
carpyncho catalog-info b216 features
Catalog b216-features
    - hname: Features
    - format: BZIP2-Parquet
    - extension: .parquet.bz2
    - date: 2020-04-14
    - md5sum: 433aae05541a2f5b191aa95d717fa83c  features_b216.parquet.bz2
    - filename: features_b216.parquet.bz2
    - driveid: 1-t165sLjn0k507SFeW-A4p9wYVL9rP4B
    - size: 142.2 MiB
    - records: 334,773
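The `md5sum` field above refers to the compressed file `features_b216.parquet.bz2`, not to any converted output such as a CSV. If you have that compressed file on disk, its checksum could be verified with a sketch like the one below. The helper names are hypothetical, and where Carpyncho keeps its cached copy is not covered here, so the path is left to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_catalog_md5(path, md5sum_field):
    """Compare a file against the md5sum field printed by catalog-info.

    The field has the shape '<hex digest>  <filename>', so only the first
    whitespace-separated token is used.
    """
    expected = md5sum_field.split()[0].lower()
    return md5_of_file(path) == expected
```

For this catalog the second argument would be the string `"433aae05541a2f5b191aa95d717fa83c  features_b216.parquet.bz2"`.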

So the table takes 142 MiB for 334,773 rows. Let's download it and store it in CSV format:

[5]:
carpyncho download-catalog b216 features --out b216_features.csv
b216-features: 149MB [03:03, 811kB/s]
Writing b216_features.csv...
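The progress bar reports 149MB while `catalog-info` reports 142.2 MiB. The two figures agree if the progress bar counts decimal megabytes (an assumption about the progress bar, not something stated in the output). A quick check of the arithmetic:

```python
def mib_to_mb(mib):
    """Convert mebibytes (2**20 bytes) to decimal megabytes (10**6 bytes)."""
    return mib * 2**20 / 10**6


# 142.2 MiB is the size reported by catalog-info for b216-features.
print(f"{mib_to_mb(142.2):.1f} MB")  # prints "149.1 MB"
```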

Now let's check the number of lines in the file to see if it is correct (warning: this command works on Linux and macOS only):

[7]:
cat b216_features.csv | wc -l
334774

The file has 334,774 lines, which matches the 334,773 records plus one header line, so the download is complete.
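Since `wc -l` is not available on every platform, the same check can be done in Python. This is a minimal sketch with hypothetical helper names. It assumes the CSV has exactly one header line and that no field contains an embedded newline.

```python
def count_lines(path):
    """Count the lines in a file without loading it all into memory.

    Unlike `wc -l`, a final line without a trailing newline is also counted.
    """
    with open(path, "rb") as fh:
        return sum(1 for _ in fh)


def row_count_matches(path, expected_records, header_lines=1):
    """Check that the file holds the expected records plus the header."""
    return count_lines(path) == expected_records + header_lines
```

For the file downloaded above, `row_count_matches("b216_features.csv", 334773)` should return `True`.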

The downloaded catalog is cached, so running the same command again does not download it a second time unless --force is given.

All the commands support more options; you can check them with carpyncho <command> --help. For example:

[11]:
carpyncho download-catalog --help
Usage: carpyncho download-catalog [OPTIONS] tile catalog

Retrives a catalog from th Carpyncho dataset collection.

Arguments:
  tile         The name of the tile.
  catalog      The name of the catalog.

Options:
  --out=STR    Path to store the catalog. The extension of the file detemines
               the format. Options are ".xlsx" (Excel), ".csv", ".pkl" (Python
               pickle) and ".parquet".
  --force      Force to ignore the cached value and redownload the catalog. Try
               to always set force to False.

Other actions:
  -h, --help   Show the help
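Because the extension passed to `--out` selects the output format, a script that builds file names may want to validate them before starting a long download. The sketch below is a hypothetical pre-check based only on the four extensions listed in the help text. It is not how Carpyncho itself resolves the format, and matching the extension exactly (case-sensitive) is an assumption.

```python
from pathlib import Path

# Extensions listed in the `download-catalog --help` text.
SUPPORTED_EXTENSIONS = {
    ".xlsx": "Excel",
    ".csv": "CSV",
    ".pkl": "Python pickle",
    ".parquet": "Parquet",
}


def output_format(out_path):
    """Return the format name implied by the final extension of out_path."""
    ext = Path(out_path).suffix
    if ext not in SUPPORTED_EXTENSIONS:
        supported = ", ".join(sorted(SUPPORTED_EXTENSIONS))
        raise ValueError(f"Unsupported extension {ext!r}; expected one of: {supported}")
    return SUPPORTED_EXTENSIONS[ext]


print(output_format("b216_features.csv"))  # prints "CSV"
```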

[12]:
date
Thu Apr 23 22:38:42 -03 2020