
Searching for a specific file type

Bin2
New member | Level 2

Hi there,

I'll ask for some patience and forgiveness in advance. I'm about 2 weeks into Python development, so I'm likely missing some obvious approaches - please don't assume a lot of knowledge on my part if you can help.

 

Objective: I've been tasked with 'crawling' through folders on Dropbox via the API to look for certain image types (specific file extensions only, *.dco for reference, as I won't know the file names), and then extracting the path and filename (and then 'do stuff'). I have already completed this code locally (in other words, if the files are on my computer it works fine), but now it needs to work in Dropbox as well because the data sets will be quite large. I cannot assume that the files in the folders will be the types I want, hence I need to search by filename extension.

 

I have access to the Dropbox account and have authorization sorted. I've created a folder and put in some temp files (the PDFs and JPEGs provided by Dropbox for testing). I can query the folder and return a list of results via files_list_folder.

I can get a list of files, extensions and paths via the code below. The issue is that I cannot filter the data by file extension, and the rudimentary methods I am using are not working.

 

While I can get a list of files, and even 'copy' them to another list, I cannot find a way to parse the list to give me the path_lower (the directory and filename), which would give me the extension (i.e. find all *.jpg files). I must be missing something in the way the data is constructed (I understand it's instance/object based). I assume I'm not hitting on the correct keyword combinations to extract the data, so I'm looking for some help in identifying where I'm going wrong. Thanks in advance!

 

import dropbox
from dropbox import Dropbox

my_client = Dropbox(token)
folderfile_list = my_client.files_list_folder('', True, True)

# this gives me a nice list of items - however I don't seem to be able to *do* anything with it
for item in folderfile_list.entries:
    if isinstance(item, dropbox.files.FileMetadata):
        name = item.name
        fileID = item.id
        fileHash = item.content_hash
        path = item.path_lower
        print(name, path)

# This does return the search results I want - but it's not iterable, so I don't seem to be able to do anything with it
files_search = my_client.files_search('', '*.pdf')
print(files_search)

type(files_search)
Out[325]: <class 'dropbox.files.SearchResult'>


#this returns nothing
for files in folderfile_list.entries:
    if files.path_lower == '*.jpg':
        print("yes")

#this returns nothing
for item in folderfile_list.entries:
    if item.path_lower == '*.jpg':
        print("I got it")
    else:
        print("still nothing")

# this also doesn't work
import fnmatch
pattern = '*.jpg'
matching = fnmatch.filter(folderfile_list.entries, pattern)
print(matching)

fname = []
for i in folderfile_list.entries:
    fname.append(i)
print(fname[1])

import fnmatch
pattern ='*.jpg'
matching = fnmatch.filter(fname, pattern)
print(matching)

# this did work - however I cannot find a file TYPE with this - the specific file
# name I can find - but not the file extension.
# In other words if I change this to *.pdf - it does not get a 'happy' result :(
for files in fname:
    if files.path_lower == '/test folder/strategy-session-hotel.pdf':
        print("happy")
        print(files.path_lower)
    else:
        print('unhappy')

 

 

 


Re: Searching for a specific file type

Bin2
New member | Level 2

I believe I have solved my own problem - posting it in case anyone else needs it. It's not pretty, but it works.

 

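import fnmatch

# dbx is an already-authenticated dropbox.Dropbox client
spot = []
holder = dbx.files_list_folder('/Test Folder')
print(holder)
# collect the lowercased path of every entry as a plain string
for files in holder.entries:
    spot.append(files.path_lower)

print(spot)

# fnmatch.filter works here because spot holds strings, not Metadata objects
pattern = '*.jpg'
matching = fnmatch.filter(spot, pattern)
print(matching)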

['/test folder/az-car-rental.jpg', '/test folder/il-car-rental.jpg', '/test folder/car-rental-invoice.jpg', '/test folder/dinner-receipt.jpg', '/test folder/lunch-receipt.jpg', '/test folder/meal-receipt.jpg', '/test folder/meetup-dinner.jpg', '/test folder/team-offsite-lunch.jpg', '/test folder/training-airfare.jpg', '/test folder/training-hotel-invoice.jpg', '/test folder/travel-meal.jpg']

Re: Searching for a specific file type

Greg-DB
Dropboxer

I'm glad to hear you already got this working. You have the right idea in that you can call files_list_folder to list the contents of a folder, and then check the Metadata.path_lower (or Metadata.name) for the returned entries to see if the file extension is one you're interested in.
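
For anyone who wants to keep the entry objects themselves rather than a list of path strings, here is a minimal sketch of that check. The helper name match_entries is hypothetical (it is not part of the SDK), and it assumes each entry exposes a path_lower attribute, as the entries returned by files_list_folder do. It also shows why fnmatch.filter on the raw entries came back empty: fnmatch compares strings, so the pattern has to be applied to entry.path_lower, not to the entry object.

```python
import fnmatch


def match_entries(entries, pattern):
    """Return the entries whose path_lower matches a lowercase glob pattern.

    Hypothetical helper: `entries` is any iterable of objects that have a
    `path_lower` attribute (for example result.entries from files_list_folder).
    """
    return [
        entry
        for entry in entries
        # skip entries without a path, then compare the string to the pattern
        if entry.path_lower and fnmatch.fnmatchcase(entry.path_lower, pattern)
    ]
```

Because path_lower is already lowercased, the pattern should be lowercase too (for example '*.dco'). The matched entries still carry name, id and the other metadata, so nothing needs to be looked up again afterwards.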

 

Note though that you should also implement files_list_folder_continue to make sure you can receive all of the entries. Check out the files_list_folder documentation for more information.
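
As a rough sketch of that pagination loop: the function name list_all_entries is my own, and client is assumed to be an authenticated dropbox.Dropbox instance. It relies only on files_list_folder, files_list_folder_continue, and the has_more, cursor and entries fields of the result.

```python
def list_all_entries(client, path="", recursive=True):
    """Collect every entry under `path`, following the cursor page by page.

    Hypothetical helper: `client` is assumed to be a dropbox.Dropbox instance.
    """
    result = client.files_list_folder(path, recursive=recursive)
    entries = list(result.entries)
    # keep asking for the next page until the API reports there is no more
    while result.has_more:
        result = client.files_list_folder_continue(result.cursor)
        entries.extend(result.entries)
    return entries
```

For large data sets a single files_list_folder call may return only part of the listing, so filtering by extension should happen on the combined list (or page by page inside the loop).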

 

Also, one alternative for your file extension check may be to use the 'endswith' method like this:

files.path_lower.endswith(".jpg")
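
A small sketch of how that could be extended to several extensions at once, since str.endswith also accepts a tuple. The helper name has_extension and the default extension list are assumptions for illustration only.

```python
def has_extension(path, extensions=(".jpg", ".jpeg")):
    """Return True if `path` ends with one of `extensions`, ignoring case.

    Hypothetical helper: works on any path string, such as entry.path_lower
    or entry.name.
    """
    wanted = tuple(ext.lower() for ext in extensions)
    return path.lower().endswith(wanted)


paths = ["/test folder/lunch-receipt.jpg", "/test folder/strategy-session-hotel.pdf"]
jpg_paths = [p for p in paths if has_extension(p)]
```

Lowercasing both sides means the same check works on Metadata.name, which keeps the original capitalization.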