Kaggle API - The Missing Python Documentation

The official Kaggle API is a command-line utility written in Python 3, but its documentation covers only command-line usage, not Python usage. This post explains how to use the API (version 1.5.6) from within Python.

1. Installing Kaggle API

Run pip install kaggle to install the API. If the installation fails on Linux or Mac, try pip install --user kaggle instead.

2. Setting up the API Key

The Kaggle API requires an API token. Go to the Account tab (https://www.kaggle.com/<username>/account) and click ‘Create API Token’. A file named kaggle.json will be downloaded. Move this file into the ~/.kaggle/ folder on Mac and Linux, or into C:\Users\<username>\.kaggle\ on Windows. The API needs this file to authenticate, so do not skip this step.
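
The move can also be scripted. Below is a minimal sketch, not part of the Kaggle API itself: install_token is a hypothetical helper name, and the location of the downloaded file is whatever your browser used. It also restricts the file to the current user, since the client warns when the token is readable by other users.

```python
import os
import shutil
import stat
from pathlib import Path


def install_token(downloaded, config_dir=None):
    """Copy a downloaded kaggle.json into the Kaggle config folder."""
    # Default to the .kaggle folder inside the user's home directory.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copyfile(downloaded, target)
    # Owner read/write only; on Windows chmod only toggles the read-only flag.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target


# Example call (the Downloads path is an assumption, adjust it to your system):
# install_token(Path.home() / "Downloads" / "kaggle.json")
```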

Alternatively, you can populate the KAGGLE_USERNAME and KAGGLE_KEY environment variables with the values from kaggle.json. Note that environment variables take precedence over the kaggle.json file, so setting them incorrectly causes authentication to fail even if kaggle.json is correct.
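
If you prefer to set the variables from Python rather than from the shell, something like the sketch below should work. It assumes kaggle.json holds the two fields "username" and "key"; export_credentials is a hypothetical helper name. Run it before authenticating (and, to be safe, before importing the kaggle package).

```python
import json
import os
from pathlib import Path


def export_credentials(token_path, environ=os.environ):
    """Copy username and key from a kaggle.json file into environment variables."""
    # Assumption: the token file is a JSON object with "username" and "key".
    creds = json.loads(Path(token_path).read_text())
    environ["KAGGLE_USERNAME"] = creds["username"]
    environ["KAGGLE_KEY"] = creds["key"]
    return environ


# Example call, reading the token from a location of your choice:
# export_credentials("/path/to/kaggle.json")
```

The environ parameter exists only so the helper can be tried against a plain dict without touching the real environment.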

3. Initializing and Authenticating

The following lines give you an authenticated API instance.

from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
api.authenticate()

4. Interacting with competitions

4.1 Searching competitions

# Searching competitions
# Signature: competitions_list(group=None, category=None, sort_by=None, page=1, search=None)
competitions = api.competitions_list(search='cat', category='playground')

# competitions is a list of competition objects.
# Iterate through the list to access each competition.
for comp in competitions:
    print(comp.ref, comp.reward, comp.userRank, sep=',')

Most of the list methods in the API have a command-line counterpart that prints formatted results. These counterparts are of little use for automation because they don't return anything, but they can help when you are exploring the API.
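
The page argument in the competitions_list signature suggests that results come back one page at a time. A rough sketch for collecting several pages follows. list_all is a hypothetical helper, and it assumes an empty page marks the end of the results, which this post does not confirm.

```python
def list_all(list_fn, max_pages=10, **filters):
    """Collect results from a paginated list method, one page at a time."""
    results = []
    for page in range(1, max_pages + 1):
        batch = list_fn(page=page, **filters)
        if not batch:  # assumption: an empty page means there is nothing more
            break
        results.extend(batch)
    return results


# Example call with the authenticated api object from section 3:
# competitions = list_all(api.competitions_list, search='cat', category='playground')
```

The max_pages cap keeps the loop bounded if the stopping assumption turns out to be wrong.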

4.2 Listing and downloading competition Files

# List files for a competition
# Signature: competitions_data_list_files(id, **kwargs)
api.competitions_data_list_files('titanic')

# Download all files for a competition
# Signature: competition_download_files(competition, path=None, force=False, quiet=True)
api.competition_download_files('titanic')

# Download single file for a competition
# Signature: competition_download_file(competition, file_name, path=None, force=False, quiet=False)
api.competition_download_file('titanic', 'gender_submission.csv')
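
Downloading all files for a competition typically leaves you with a single zip archive. The sketch below unpacks it with the standard library. It assumes the archive is named after the competition (for example titanic.zip) and sits in the download folder; check the actual file name on your machine. extract_archive is a hypothetical helper.

```python
import zipfile
from pathlib import Path


def extract_archive(archive, dest=None):
    """Unpack a downloaded zip archive and return the names it contained."""
    archive = Path(archive)
    # By default, extract next to the archive: titanic.zip -> titanic/
    dest = Path(dest) if dest else archive.with_suffix("")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())


# Example call after api.competition_download_files('titanic', path='data'):
# extract_archive('data/titanic.zip')  # file name is an assumption
```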

4.3 Submitting to competitions

# Signature: competition_submit(file_name, message, competition, quiet=False)
api.competition_submit('gender_submission.csv', 'API Submission', 'titanic')

4.4 Retrieving the leaderboard

# Signature: competition_view_leaderboard(id, **kwargs)
leaderboard = api.competition_view_leaderboard('titanic')

5. Interacting with datasets

5.1 Searching datasets

# Signature: dataset_list(sort_by=None, size=None, file_type=None, license_name=None, tag_ids=None, search=None, user=None, mine=False, page=1, max_size=None, min_size=None)
datasets = api.dataset_list(search='demographics', license_name='cc', file_type='csv')

# datasets is a list of dataset objects
for dat in datasets:
    print(dat.ref, dat.viewCount, dat.voteCount, sep=',')
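
For automation you will usually want the results as data rather than printed lines. A small sketch that turns any list of result objects into CSV text is shown below. It only relies on the attributes used above (ref, viewCount, voteCount); to_csv is a hypothetical helper, and missing attributes are written as empty cells.

```python
import csv
import io


def to_csv(items, fields):
    """Render the chosen attributes of each item as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields)
    for item in items:
        # getattr with a default keeps the helper tolerant of missing attributes
        writer.writerow([getattr(item, field, "") for field in fields])
    return buf.getvalue()


# Example call with the datasets list from above:
# print(to_csv(datasets, ["ref", "viewCount", "voteCount"]))
```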

5.2 Listing dataset files

# Signature: dataset_list_files(dataset)
# The dataset string should be in the format [owner]/[dataset-name]
api.dataset_list_files('avenn98/world-of-warcraft-demographics').files
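
Since the dataset string has to follow the [owner]/[dataset-name] format, it can help to validate it before calling the API. This is a sketch only: split_dataset_ref is a hypothetical helper, and it deliberately rejects anything that is not exactly two non-empty parts.

```python
def split_dataset_ref(ref):
    """Split 'owner/dataset-name' into its two parts, or raise ValueError."""
    owner, sep, name = ref.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected [owner]/[dataset-name], got {ref!r}")
    return owner, name


# Example call:
# owner, name = split_dataset_ref('avenn98/world-of-warcraft-demographics')
```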

5.3 Downloading Files

# Download all files of a dataset
# Signature: dataset_download_files(dataset, path=None, force=False, quiet=True, unzip=False)
api.dataset_download_files('avenn98/world-of-warcraft-demographics')

# Download a single file
# Signature: dataset_download_file(dataset, file_name, path=None, force=False, quiet=True)
api.dataset_download_file('avenn98/world-of-warcraft-demographics', 'WoW Demographics.csv')
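
The unzip flag in the dataset_download_files signature can save a manual extraction step. A possible wrapper that downloads into a dedicated folder and reports what landed there is sketched below. download_dataset is a hypothetical helper that takes the authenticated api object as an argument; the folder name is up to you.

```python
from pathlib import Path


def download_dataset(api, dataset, dest):
    """Download and unzip a dataset into dest, then list the files found there."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # path and unzip are taken from the signature shown above
    api.dataset_download_files(dataset, path=str(dest), unzip=True)
    return sorted(p.name for p in dest.iterdir())


# Example call with the authenticated api object from section 3:
# download_dataset(api, 'avenn98/world-of-warcraft-demographics', 'data/wow')
```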

6. Interacting with Kernels

6.1 Searching Kernels

# Signature: kernels_list(page=1, page_size=20, dataset=None, competition=None, parent_kernel=None, search=None, mine=False, user=None, language=None, kernel_type=None, output_type=None, sort_by=None)
kernels = api.kernels_list(search='titanic')
for kernel in kernels:
    print(kernel.ref, kernel.totalVotes, kernel.language, sep=',')

6.2 Retrieve a kernel's output

# Retrieve output for a specified kernel
# Signature: kernels_output(kernel, path, force=False, quiet=True)
api.kernels_output('startupsci/titanic-data-science-solutions', path='.')

6.3 Get the status of the latest kernel run

# Signature: kernels_status(kernel)
api.kernels_status('startupsci/titanic-data-science-solutions')
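
After pushing a kernel you may want to wait until its run finishes. The polling sketch below rests on two assumptions that this post does not confirm: that kernels_status returns a mapping with a 'status' entry, and that 'complete' and 'error' are the terminal values. Inspect the real return value first and adjust accordingly. wait_for_kernel is a hypothetical helper.

```python
import time


def wait_for_kernel(api, kernel, done=("complete", "error"),
                    interval=30, max_checks=20, sleep=time.sleep):
    """Poll the kernel status until it reaches one of the 'done' values."""
    for _ in range(max_checks):
        # Assumption: the response behaves like a dict with a 'status' key.
        status = api.kernels_status(kernel)["status"]
        if status in done:
            return status
        sleep(interval)
    raise TimeoutError(f"{kernel} did not finish after {max_checks} checks")


# Example call with the authenticated api object from section 3:
# wait_for_kernel(api, 'startupsci/titanic-data-science-solutions')
```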

6.4 Pull a kernel to local machine

# Signature: kernels_pull(kernel, path, metadata=False, quiet=True)
api.kernels_pull('startupsci/titanic-data-science-solutions', path='.')

6.5 Initialize metadata file for a kernel

# Signature: kernels_initialize(folder)
api.kernels_initialize('./demo')

6.6 Pushing a kernel to Kaggle

# The folder must contain a valid metadata file called 'kernel-metadata.json'
# Create one using kernels_initialize if you don't have one
# Signature: kernels_push(folder)
api.kernels_push('./demo')
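
The metadata file can also be written from Python instead of being edited by hand. The sketch below uses the field names as I recall them from the Kaggle kernel metadata format; compare them with the template that kernels_initialize generates before relying on them. write_kernel_metadata and all of its argument values are hypothetical.

```python
import json
from pathlib import Path


def write_kernel_metadata(folder, username, slug, title, code_file):
    """Write a kernel-metadata.json file for a private Python script kernel."""
    # Field names are an assumption; verify against the generated template.
    metadata = {
        "id": f"{username}/{slug}",
        "title": title,
        "code_file": code_file,
        "language": "python",
        "kernel_type": "script",
        "is_private": True,
        "enable_gpu": False,
        "enable_internet": False,
        "dataset_sources": [],
        "competition_sources": [],
        "kernel_sources": [],
    }
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "kernel-metadata.json"
    path.write_text(json.dumps(metadata, indent=2))
    return path


# Example call (all values are placeholders), followed by api.kernels_push('./demo'):
# write_kernel_metadata('./demo', 'your-username', 'demo-kernel', 'Demo Kernel', 'demo.py')
```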

Have any questions? Please add them as comments and I will try my best to answer them.

By Jose Cherian