The official Kaggle API is a command-line utility written in Python 3, but its documentation covers only command-line usage, not usage from Python. This post explains how to use the API (version 1.5.6) from within Python.

1. Installing Kaggle API

Run pip install kaggle to install the API. If the installation fails on Linux or Mac, try pip install --user kaggle instead.

2. Setting up the API Key

The Kaggle API requires an API token. Go to the Account tab (https://www.kaggle.com/<username>/account) and click 'Create API Token'. A file named kaggle.json will be downloaded. Move this file into the ~/.kaggle/ folder on Mac and Linux, or into C:\Users\<username>\.kaggle\ on Windows. This step is required for authentication, so do not skip it.

Alternatively, you can populate the KAGGLE_USERNAME and KAGGLE_KEY environment variables with the values from kaggle.json. Note that the environment variables take precedence over the kaggle.json file, so setting them incorrectly causes authentication to fail even if kaggle.json is correct.

3. Initializing and Authenticating

Use the lines below to get an authenticated API instance.

    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()

4. Interacting with competitions

4.1 Searching competitions

    # Searching competitions
    # Signature: competitions_list(group=None, category=None, sort_by=None, page=1, search=None)
    competitions = api.competitions_list(search='cat', category='playground')

    # competitions is a list of competition objects;
    # iterate through it to access each competition.
    for comp in competitions:
        print(comp.ref, comp.reward, comp.userRank, sep=',')

Most of the list methods in the API have a command-line counterpart that displays formatted results. These counterparts do not return anything, so they are of little use for automation tasks, but they can be handy when exploring the API.
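The page parameter in the competitions_list signature suggests that results come back one page at a time. Below is a minimal sketch of collecting every page. The helper name iter_all_pages and its max_pages guard are hypothetical (they are not part of the Kaggle API), and the sketch assumes that a page past the last one comes back empty, which you should verify against your installed version.

```python
def iter_all_pages(list_page, max_pages=50, **filters):
    """Yield items from a paged list method, one page after another.

    list_page is any callable that accepts a page keyword argument and
    returns a list, for example api.competitions_list (assumption: an
    empty list marks the end of the results).
    """
    for page in range(1, max_pages + 1):
        items = list_page(page=page, **filters)
        if not items:
            # Assumed end-of-results signal: stop at the first empty page.
            break
        yield from items


# Hypothetical usage with the authenticated api instance from section 3:
#   for comp in iter_all_pages(api.competitions_list, search='cat'):
#       print(comp.ref)
```

Passing the list method in as an argument keeps the helper independent of the kaggle package, so the same loop can be tried with other list methods that accept a page argument, such as dataset_list or kernels_list.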
The counterpart of competitions_list, for example, is competitions_list_cli.

4.2 Listing and downloading competition files

    # List files for a competition
    # Signature: competitions_data_list_files(id, **kwargs)
    api.competitions_data_list_files('titanic')

    # Download all files for a competition
    # Signature: competition_download_files(competition, path=None, force=False, quiet=True)
    api.competition_download_files('titanic')

    # Download a single file for a competition
    # Signature: competition_download_file(competition, file_name, path=None, force=False, quiet=False)
    api.competition_download_file('titanic', 'gender_submission.csv')

4.3 Submitting to competitions

    # Signature: competition_submit(file_name, message, competition, quiet=False)
    api.competition_submit('gender_submission.csv', 'API Submission', 'titanic')

4.4 Retrieving the leaderboard

    # Signature: competition_view_leaderboard(id, **kwargs)
    leaderboard = api.competition_view_leaderboard('titanic')

5. Interacting with datasets

5.1 Searching datasets

    # Signature: dataset_list(sort_by=None, size=None, file_type=None, license_name=None, tag_ids=None, search=None, user=None, mine=False, page=1, max_size=None, min_size=None)
    datasets = api.dataset_list(search='demographics', license_name='cc', file_type='csv')

    # datasets is a list of dataset objects
    for dat in datasets:
        print(dat.ref, dat.viewCount, dat.voteCount, sep=',')

5.2 Listing dataset files

    # Signature: dataset_list_files(dataset)
    # The dataset string should be in the format [owner]/[dataset-name]
    api.dataset_list_files('avenn98/world-of-warcraft-demographics').files

5.3 Downloading files

    # Download all files of a dataset
    # Signature: dataset_download_files(dataset, path=None, force=False, quiet=True, unzip=False)
    api.dataset_download_files('avenn98/world-of-warcraft-demographics')

    # Download a single file
    # Signature: dataset_download_file(dataset, file_name, path=None, force=False, quiet=True)
    api.dataset_download_file('avenn98/world-of-warcraft-demographics', 'WoW Demographics.csv')

6. Interacting with Kernels

6.1 Searching kernels

    # Signature: kernels_list(page=1, page_size=20, dataset=None, competition=None, parent_kernel=None, search=None, mine=False, user=None, language=None, kernel_type=None, output_type=None, sort_by=None)
    kernels = api.kernels_list(search='titanic')
    for kernel in kernels:
        print(kernel.ref, kernel.totalVotes, kernel.language, sep=',')

6.2 Retrieving a kernel's output

    # Retrieve output for a specified kernel
    # Signature: kernels_output(kernel, path, force=False, quiet=True)
    api.kernels_output('startupsci/titanic-data-science-solutions', path='.')

6.3 Getting the status of the latest kernel run

    # Signature: kernels_status(kernel)
    api.kernels_status('startupsci/titanic-data-science-solutions')

6.4 Pulling a kernel to the local machine

    # Signature: kernels_pull(kernel, path, metadata=False, quiet=True)
    api.kernels_pull('startupsci/titanic-data-science-solutions', path='.')

6.5 Initializing a metadata file for a kernel

    # Signature: kernels_initialize(folder)
    api.kernels_initialize('./demo')

6.6 Pushing a kernel to Kaggle

    # The folder must contain a valid metadata file called 'kernel-metadata.json'.
    # Create one using kernels_initialize if you don't have one.
    # Signature: kernels_push(folder)
    api.kernels_push('./demo')

Have any questions? Please add them as comments and I will do my best to answer them.
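As a footnote to section 6.6: since a push depends on a valid kernel-metadata.json in the folder, it can help to check the file locally first. The sketch below is only an illustration. The function name check_kernel_metadata is hypothetical, and the default list of keys is an assumption based on Kaggle's kernel metadata format, not something the API guarantees, so adjust it to whatever your version expects.

```python
import json
from pathlib import Path

# Assumed keys to look for; not taken from the post, adjust as needed.
ASSUMED_KEYS = ("id", "title", "code_file", "language", "kernel_type")


def check_kernel_metadata(folder, required_keys=ASSUMED_KEYS):
    """Return a list of problems found with kernel-metadata.json in folder.

    An empty list means the file exists, parses as a JSON object, and has a
    non-empty value for every key in required_keys.
    """
    meta_path = Path(folder) / "kernel-metadata.json"
    if not meta_path.is_file():
        return [f"{meta_path} does not exist"]
    try:
        metadata = json.loads(meta_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{meta_path} is not valid JSON: {exc}"]
    if not isinstance(metadata, dict):
        return [f"{meta_path} must contain a JSON object"]
    return [
        f"missing or empty key: {key}"
        for key in required_keys
        if not metadata.get(key)
    ]


# Hypothetical usage before pushing:
#   problems = check_kernel_metadata('./demo')
#   if not problems:
#       api.kernels_push('./demo')
```

Returning a list of problems rather than raising lets a script print everything that is wrong in one pass before any network call is made.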
This is helpful. I have a question: if a previous version of a dataset exists, how can I do something like api.dataset_list_files('avenn98/world-of-warcraft-demographics/version/1').files? I want to download some data from a previous version. Thanks
Thanks, this is great info. Is there a way to retrieve the score of your most recent competition submission? Retrieving the leaderboard can get you the best score, but not the most recent score.