Dropbox API Support & Feedback

Dropbox API search vs. recursive list_folder with filtered results

not_oppenheimer
New member | Level 2

I'm trying to utilize the API to scan my directories for .XLS files and read them. I've attempted this in two ways:

Using '/search':

import json
import requests

# `headers` is assumed to carry the Authorization and Content-Type headers.
def search_xls_paths(headers):
    url = "https://api.dropboxapi.com/2/files/search"
    data = {
        "path": "/",
        "query": ".XLS",
        "start": 0,
        "max_results": 1000,
        "mode": {".tag": "filename"},
    }
    paths = []
    while True:
        response = requests.post(url, headers=headers, data=json.dumps(data))
        result = response.json()
        # Read fields by key instead of relying on the order of .values().
        paths.extend(x["metadata"]["path_display"] for x in result["matches"])
        if not result["more"]:
            break
        # The response's 'start' is the offset to send with the next request.
        data["start"] = result["start"]
    return set(filter(lambda x: ".XLS" in x, paths))

And using '/list_folder' and '/list_folder/continue':

import json
import requests

# `headers` is assumed to carry the Authorization and Content-Type headers.
def list_xls_paths(headers):
    url = "https://api.dropboxapi.com/2/files/list_folder"
    data = {"path": "/", "recursive": True}
    response = requests.post(url, headers=headers, data=json.dumps(data))
    result = response.json()
    # Read fields by key instead of relying on the order of .values().
    paths = [x["path_display"] for x in result["entries"] if ".XLS" in x["path_display"]]
    while result["has_more"]:
        # Each /continue call takes the cursor returned by the previous call.
        data = {"cursor": result["cursor"]}
        response = requests.post(url + "/continue", headers=headers, data=json.dumps(data))
        result = response.json()
        print(len(paths))  # progress: matches collected so far
        paths.extend(x["path_display"] for x in result["entries"] if ".XLS" in x["path_display"])
    return set(paths)

For some reason that I can't discern from diffing the results, the search option is not exhaustive. Most of the files it misses are "recent", in that they were added to the directory (or a subfolder) within the last six months, but that doesn't hold for all of them, and it doesn't seem to be a known issue with the API. I can parallelize '/search' much more easily and would prefer to use it, but I need to know that it is exhaustive.

 

Accepted Solutions

Greg-DB
Dropbox Staff

The results returned by the /2/files/search endpoint are not technically exhaustive. For search queries that have a very large number of results, not all of them may be returned. Specifically, the 'start' parameter has a maximum value of 9,999, so if there are more than 10,000 matches, you won't be able to retrieve everything.

For use cases where you have that many entries to retrieve, please use /2/files/list_folder[/continue] instead.

If that doesn't seem to be the issue here though, please open an API ticket with details on the missing search results so we can look into it for you.
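
As a rough illustration of that limit (a sketch under stated assumptions, not official Dropbox code), the pagination loop can report when it stops because of the 'start' cap rather than because the results ran out. Here `fetch_page` is a hypothetical callable that wraps one /2/files/search request and returns its parsed JSON; `MAX_START` and `collect_matches` are names made up for this example.

```python
MAX_START = 9_999  # maximum value of 'start' quoted in the reply above


def collect_matches(fetch_page, page_size=1000):
    """Page through search results and report whether the cap cut them short.

    fetch_page(start, page_size) is a hypothetical callable returning the
    parsed JSON of one search response, with keys 'matches', 'more', 'start'.
    Returns (matches, truncated).
    """
    matches = []
    start = 0
    while True:
        page = fetch_page(start, page_size)
        matches.extend(page["matches"])
        if not page["more"]:
            return matches, False  # every match was retrieved
        start = page["start"]  # offset to use for the next request
        if start > MAX_START:
            return matches, True  # more matches exist, but 'start' is capped
```

If the second return value is true, the result set is known to be incomplete, and /2/files/list_folder[/continue] is the safer choice for that query.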

not_oppenheimer
New member | Level 2

Thanks for your response. You're absolutely correct that I'm butting up against the 9,999 maximum. 

Looking through the documentation, there seems to be no built-in way to circumvent or increase that limit. I'll have to be more clever on my end, it seems.
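
One possible way to be "more clever", sketched here purely as an assumption and not as anything Dropbox recommends in this thread: scope each search to a folder, and whenever a scoped search is truncated, take that folder's own files from a non-recursive listing and search each subfolder separately. Both callables below are hypothetical wrappers you would write around the two endpoints; they are not part of any Dropbox SDK.

```python
def find_paths(folder, search, list_children):
    """Collect '.XLS' paths under `folder`, splitting wherever search is truncated.

    Hypothetical callables (assumptions for this sketch):
      search(folder)        -> (paths, truncated)   # search scoped to `folder`
      list_children(folder) -> (files, subfolders)  # non-recursive list_folder
    """
    paths, truncated = search(folder)
    if not truncated:
        return set(paths)
    # The scoped search overflowed, so its results cannot be trusted here.
    # Take this folder's direct files from the listing instead, then handle
    # each subfolder on its own.
    files, subfolders = list_children(folder)
    found = {p for p in files if ".XLS" in p}
    for sub in subfolders:
        found |= find_paths(sub, search, list_children)
    return found
```

The subfolder calls are independent of one another, so they could be dispatched in parallel, which is the property that made '/search' attractive in the first place. This only helps with the 'start' cap; it does not address any other reason a search might miss files.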
