
Dropbox API Support & Feedback

Find help with the Dropbox API from other developers.


python upload big file example

hsyn
New member | Level 2

Could you please share example code showing how to upload big files (size > 150 MB) with the Python API v2 SDK?

7 Replies

Greg-DB
Dropbox Staff

We don't currently have an official sample app for that, but I'll be sure to pass this along as a request for one.

Here's a quick example I put together though: (note, I haven't tested this extensively, and it doesn't have any error handling)

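import os
import dropbox

# Assumes dbx (a dropbox.Dropbox client), file_path, and dest_path are already defined.
CHUNK_SIZE = 4 * 1024 * 1024
file_size = os.path.getsize(file_path)

# Open in binary mode so that read() returns bytes.
with open(file_path, 'rb') as f:
    if file_size <= CHUNK_SIZE:
        # Small file: a single upload call is enough.
        print(dbx.files_upload(f.read(), dest_path))
    else:
        # Start a session with the first chunk, then track the offset in a cursor.
        upload_session_start_result = dbx.files_upload_session_start(f.read(CHUNK_SIZE))
        cursor = dropbox.files.UploadSessionCursor(session_id=upload_session_start_result.session_id,
                                                   offset=f.tell())
        commit = dropbox.files.CommitInfo(path=dest_path)

        while f.tell() < file_size:
            if (file_size - f.tell()) <= CHUNK_SIZE:
                # Last chunk: finish the session and commit the file.
                print(dbx.files_upload_session_finish(f.read(CHUNK_SIZE), cursor, commit))
            else:
                dbx.files_upload_session_append(f.read(CHUNK_SIZE),
                                                cursor.session_id,
                                                cursor.offset)
                cursor.offset = f.tell()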
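
Since the sample above has no error handling, here is a minimal sketch of one way to retry a single call when the connection drops. The helper name call_with_retries, its parameters, and the choice of which exception types to retry on are all hypothetical and not part of the Dropbox SDK; check which exceptions your SDK version actually raises before relying on it.

```python
import time


def call_with_retries(func, *args, attempts=3, base_delay=1.0,
                      retry_on=(ConnectionError,), sleep=time.sleep):
    """Call func(*args), retrying when one of the retry_on exceptions is raised.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

To use it with an upload session, read the chunk into a variable once and pass that same bytes object on every attempt (for example, call_with_retries(dbx.files_upload_session_append, data, cursor.session_id, cursor.offset)), so that a retry resends the same data at the same offset.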

hsyn
New member | Level 2

Thanks Gregory,

It is working without any problem. 

The only thing is that in Python 3 the file must be opened in binary mode, like this:

file = open(file_path,'rb')
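
To illustrate the binary-mode point, here is a small sketch of reading a file in fixed-size chunks as bytes, with offsets counted in bytes so that they line up with os.path.getsize. The helper name iter_chunks is hypothetical and is not part of the Dropbox SDK.

```python
def iter_chunks(path, chunk_size=4 * 1024 * 1024):
    """Yield (offset, data) pairs, where data is a bytes object of at most chunk_size."""
    # 'rb' makes read() return bytes; in text mode Python 3 returns str instead.
    with open(path, "rb") as f:
        offset = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            yield offset, data
            offset += len(data)
```

Each data value here is the kind of object the upload calls expect, and the offset after the final chunk equals the file size in bytes.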

Paulo L.
New member | Level 1

Hello All,

Is session_upload supposed to be horrendously slow?

I have written some code, which (honestly) is a bit more convoluted than Gregory's sample, but in essence does exactly the same thing and each call to "files_upload_session_append" is taking "forever" to return.  I even get "connection aborted" here and there... 😞

I am sending in 4M chunks as in Gregory's sample...

Any hints, or is this probably my network?  (I will test using other connections and post back if anything changes!)

TIA,

Paulo

Greg-DB
Dropbox Staff

Hi Paulo, the majority of the time taken by the files_upload_session_append call should be the time spent actually sending the file content to Dropbox over the network. I can't reproduce the issues you're seeing, so it does seem likely these issues are related to your network connection. 

Also, note that all of the Dropbox servers are located in the US. Your connection speed to Dropbox depends on the routing you get between your ISP and our servers, and may be slower than your ISP's rated speeds.

Sometimes resetting or retrying your connection gets you a different route and better speeds, but that is outside of our control. Some ISPs also throttle sustained connections so if you see an initial high connection speed followed by lower speeds, that could be the reason.

Finally, if you think there may be something interfering with your connection to the Dropbox API (e.g., a firewall, proxy, or other security software) you can try testing your ability to connect to content.dropboxapi.com.
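
One rough way to run that connectivity test from Python is sketched below. It only checks that a TCP connection can be opened; it does not check TLS or make any API call. The helper name can_connect and the default port of 443 (HTTPS) are assumptions made for illustration.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, can_connect("content.dropboxapi.com") returning False would suggest that something on the network path (a firewall, proxy, or DNS issue) is blocking the connection.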

Paulo L.
New member | Level 1

Hi Gregory,

I have made some tests under different connections, and my speed issues do seem to be network-related.

Thanks for the feedback and the sample code; I will use it to optimize mine.

I will publish my results on GitHub and post the link back here for the convenience of the community.

Thanks again,

Paulo.

barry m.10
New member | Level 1

Hi Gregory,

<<<CHUNK_SIZE = 4 * 1024 * 1024>>>

Is there any reason why you chose 4 MB in your Python example code? (Thanks, BTW.)

Why choose between a single files_upload call and a chunked upload session at 4 MB, when 150 MB is the upper limit for the former? Also, why transmit in 4 MB blocks when 150 MB is permitted?

Is 4 MB preferred for some reason? Is there some efficiency gain? I see that it has been claimed that 4 MB is Dropbox's chunk size for de-duplication. Is this a factor, or were you just intending to minimise resources in your demonstration snippet?

http://blog.fosketts.net/2011/07/11/dropbox-data-format-deduplication/

Thanks

 

Greg-DB
Dropbox Staff

Hi Barry, there wasn't really any particular reason for 4 MB in my sample. A larger size could certainly improve overall performance (by reducing the overhead of making more connections), but it comes at the cost of making each call more likely to fail. Further, the app would have to re-upload more data for any particular failed call.

So, it's really just a tradeoff for you to make based on the use cases for your app. E.g., if you know your app is likely to be used with weak, slow, or unreliable network connections, a smaller chunk size is better. Otherwise, a larger chunk size would be good.
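
The tradeoff can be made concrete with a little arithmetic. The sketch below (the helper name chunk_plan is hypothetical) computes how many upload calls a file needs at a given chunk size, and the most data that would have to be re-sent if one call fails.

```python
def chunk_plan(file_size, chunk_size):
    """Return (number_of_calls, max_bytes_resent_after_one_failed_call)."""
    if file_size <= 0 or chunk_size <= 0:
        raise ValueError("file_size and chunk_size must be positive")
    # Ceiling division: a partial final chunk still needs its own call.
    calls = -(-file_size // chunk_size)
    # A failed call costs at most one chunk (or the whole file, if smaller).
    worst_case_resend = min(file_size, chunk_size)
    return calls, worst_case_resend


MIB = 1024 * 1024
print(chunk_plan(1024 * MIB, 4 * MIB))    # 256 calls, up to 4 MiB re-sent
print(chunk_plan(1024 * MIB, 128 * MIB))  # 8 calls, up to 128 MiB re-sent
```

For a 1 GiB file, going from 4 MiB to 128 MiB chunks cuts the number of calls from 256 to 8, but a single failed call then costs up to 128 MiB of re-uploading instead of 4 MiB.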
