Command line interface (CLI)
After installing Carpyncho, a command line application is available to download any dataset.
[1]:
carpyncho --help
Usage: carpyncho command [args...]
Carpyncho console client.
Explore and download the entire https://carpyncho.github.io/ catalogs from your
command line.
Commands:
catalog-info Retrieve the information about a given catalog.
download-catalog Retrives a catalog from th Carpyncho dataset collection.
has-catalog Check if a given catalog and tile exists.
list-catalogs Show the available catalogs for a given tile.
list-tiles Show available tiles.
version Print Carpyncho version.
This software is under the BSD 3-Clause License. Copyright (c) 2020, Juan
Cabral. For bug reporting or other instructions please check:
https://github.com/carpyncho/carpyncho-py
To list all available tiles, we can run:
[2]:
carpyncho list-tiles
- b206
- b214
- b216
- b220
- b228
- b234
- b247
- b248
- b261
- b262
- b263
- b264
- b277
- b278
- b356
- b360
- b396
Then we can check all the available catalogs for a given tile (b216, for example):
[3]:
carpyncho list-catalogs b216
Tile b216
- features
- lc
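The two listing commands can be chained to print the catalogs of every tile. This is only a sketch: it assumes the output format shown above (one `- name` line per tile), and `parse_tiles` is a hypothetical helper, not part of Carpyncho.

```shell
# parse_tiles: hypothetical helper. Reads `carpyncho list-tiles` output on
# stdin and prints one bare tile name per line (assumes "- <name>" lines).
parse_tiles() {
    sed -n 's/^[[:space:]]*- //p'
}

# Only call the CLI when it is actually installed.
if command -v carpyncho >/dev/null 2>&1; then
    carpyncho list-tiles | parse_tiles | while read -r tile; do
        carpyncho list-catalogs "$tile"
    done
fi
```

Lines that do not start with `- ` are ignored, so any header line printed before the list would not be mistaken for a tile name.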
Let's assume we want to download the features catalog from tile b216. First, let's check how big the catalog is before downloading it:
[4]:
carpyncho catalog-info b216 features
Catalog b216-features
- hname: Features
- format: BZIP2-Parquet
- extension: .parquet.bz2
- date: 2020-04-14
- md5sum: 433aae05541a2f5b191aa95d717fa83c features_b216.parquet.bz2
- filename: features_b216.parquet.bz2
- driveid: 1-t165sLjn0k507SFeW-A4p9wYVL9rP4B
- size: 142.2 MiB
- records: 334,773
Well, that is about 142 MiB for 334,773 rows in the table. Let's download it and store it in CSV format:
[5]:
carpyncho download-catalog b216 features --out b216_features.csv
b216-features: 149MB [03:03, 811kB/s]
Writing b216_features.csv...
Now let's check the number of lines in the file to see if the download is correct (warning: this works on Linux and macOS only):
[7]:
cat b216_features.csv | wc -l
334774
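The file has 334,774 lines, which matches the 334,773 records reported by catalog-info plus one header line, so the download is complete.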
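The row check above can be automated by comparing the `records` field reported by `catalog-info` with the number of data lines in the CSV. This is a minimal sketch, assuming the `- records: 334,773` output format shown earlier, a single header line, and no CSV field with an embedded newline; `expected_records` and `csv_rows` are hypothetical helper names.

```shell
# expected_records: reads `carpyncho catalog-info` output on stdin and prints
# the value of the "records" field with the thousands separators removed.
expected_records() {
    sed -n 's/^[[:space:]]*- records:[[:space:]]*//p' | tr -d ','
}

# csv_rows: prints the number of data rows in a CSV file, that is, the line
# count minus the (assumed) single header line.
csv_rows() {
    echo $(( $(wc -l < "$1") - 1 ))
}

# Run the comparison only when the CLI and the downloaded file are present.
if command -v carpyncho >/dev/null 2>&1 && [ -f b216_features.csv ]; then
    expected=$(carpyncho catalog-info b216 features | expected_records)
    actual=$(csv_rows b216_features.csv)
    if [ "$expected" = "$actual" ]; then
        echo "OK: $actual rows"
    else
        echo "MISMATCH: expected $expected rows, found $actual" >&2
    fi
fi
```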
If you run the same command multiple times, the catalog is cached, so it is not downloaded again unless you pass --force.
All the commands support more options; you can check them with carpyncho <command> --help. For example:
[11]:
carpyncho download-catalog --help
Usage: carpyncho download-catalog [OPTIONS] tile catalog
Retrives a catalog from th Carpyncho dataset collection.
Arguments:
tile The name of the tile.
catalog The name of the catalog.
Options:
--out=STR Path to store the catalog. The extension of the file detemines
the format. Options are ".xlsx" (Excel), ".csv", ".pkl" (Python
pickle) and ".parquet".
--force Force to ignore the cached value and redownload the catalog. Try
to always set force to False.
Other actions:
-h, --help Show the help
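Because the extension passed to `--out` determines the format, the same catalog can be stored as Parquet instead of CSV just by changing the file name. The sketch below builds the output name and rejects extensions that are not listed in the help text; `out_name` is a hypothetical helper, not part of Carpyncho.

```shell
# out_name: prints "<tile>_<catalog>.<ext>" for the extensions listed in the
# help text above, and fails for anything else.
out_name() {
    case "$3" in
        xlsx|csv|pkl|parquet) printf '%s_%s.%s\n' "$1" "$2" "$3" ;;
        *) echo "unsupported extension: $3" >&2; return 1 ;;
    esac
}

# Download only when the CLI is installed and the file name is valid.
if command -v carpyncho >/dev/null 2>&1; then
    out=$(out_name b216 features parquet) &&
        carpyncho download-catalog b216 features --out "$out"
fi
```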
[12]:
date
jue abr 23 22:38:42 -03 2020