Dropbox API Support & Feedback

Find help with the Dropbox API from other developers.

[C#] Large files don't show up in user's cloud after finishing batch upload

Andrewer016
Explorer | Level 4

Hey there!

I'm trying to upload large files with batch upload (as suggested by Dropbox: https://www.dropbox.com/developers/reference/data-ingress-guide).

Everything seems to work, except that when I finish the batch upload with the UploadSessionFinishBatchAsync method, the large files don't show up in the cloud.

For clarity, here's the method I use:

public async Task UploadFileBatch(List<string> localPaths)
        {
            //const int ChunkSize = 4096 * 1024;
            const int ChunkSize = 1024 * 1024;
            List<UploadSessionFinishArg> uploadSessionFinishArgs = new List<UploadSessionFinishArg>();
            List<FileStream> openedFileStreams = new List<FileStream>();
            using (var dbx = new DropboxClient("<REDACTED>"))
            {
                for (int i = 0; i < localPaths.Count; i++)
                {
                    string[] localPathBits = localPaths[i].Split('\\');
                    string remotePath = remoteUploadPath;
                    foreach (var bit in localPathBits)
                    {
                        if (!bit.Equals("..") && !bit.Equals("CRUD_tests"))
                        {
                            remotePath += bit + "/";
                        }
                    }
                    remotePath = remotePath.Remove(remotePath.Length - 1);

                    var fileInfo = new FileInfo(localPaths[i]);
                    FileStream fileStream = fileInfo.Open(FileMode.Open, FileAccess.Read, FileShare.Read);
                    openedFileStreams.Add(fileStream);
                    if (fileStream.Length <= ChunkSize)
                    {
                        var offset = (ulong)fileStream.Length;
                        var result = await dbx.Files.UploadSessionStartAsync(true, fileStream);
                        var sessionId = result.SessionId;
                        var cursor = new UploadSessionCursor(sessionId, offset);
                        UploadSessionFinishArg uploadSessionFinishArg = new UploadSessionFinishArg(cursor, new CommitInfo(remotePath, WriteMode.Overwrite.Instance));
                        uploadSessionFinishArgs.Add(uploadSessionFinishArg);
                    }
                    else
                    {
                        string sessionId = null;
                        Console.WriteLine("IN BIG BATCH");
                        byte[] buffer = new byte[ChunkSize];
                        ulong numChunks = (ulong)Math.Ceiling((double)fileStream.Length / ChunkSize);
                        Console.WriteLine("numChunks: " + numChunks);
                        for (ulong idx = 0; idx < numChunks; idx++)
                        {
                            Console.WriteLine("UPLOADING CHUNK #{0}", idx + 1);
                            var byteRead = fileStream.Read(buffer, 0, ChunkSize);

                            using (var memStream = new MemoryStream(buffer, 0, byteRead))
                            {
                                if (idx == 0)
                                {
                                    var result = await dbx.Files.UploadSessionStartAsync(false, memStream);
                                    sessionId = result.SessionId;
                                }
                                else
                                {
                                    Console.WriteLine(localPaths[i] + " : " + sessionId + " : " + (ulong)ChunkSize * idx);
                                    var cursor = new UploadSessionCursor(sessionId, (ulong)ChunkSize * idx);

                                    if (idx == numChunks - 1)
                                    {
                                        await dbx.Files.UploadSessionAppendV2Async(cursor, true, memStream);
                                        cursor = new UploadSessionCursor(sessionId, (ulong)ChunkSize * idx);
                                        UploadSessionFinishArg uploadSessionFinishArg = new UploadSessionFinishArg(cursor, new CommitInfo(remotePath, WriteMode.Overwrite.Instance));
                                        uploadSessionFinishArgs.Add(uploadSessionFinishArg);
                                        Console.WriteLine("FINISHING CHUNK UPLOAD");
                                    }
                                    else
                                    {
                                        await dbx.Files.UploadSessionAppendV2Async(cursor, false, memStream);
                                    }
                                }
                            }
                        }
                    }
                }
                foreach (var arg in uploadSessionFinishArgs)
                {
                    Console.WriteLine(arg.Commit.Path);
                    Console.WriteLine(arg.Cursor.SessionId);
                    Console.WriteLine(arg.Cursor.Offset);
                }
                var batchResult = await dbx.Files.UploadSessionFinishBatchAsync(uploadSessionFinishArgs);
                Console.WriteLine("isAsyncJobId: {0} isComplete: {1}, isOther: {2}", batchResult.IsAsyncJobId, batchResult.IsComplete, batchResult.IsOther);
                Console.WriteLine(batchResult.AsAsyncJobId.Value);
                var status = await dbx.Files.UploadSessionFinishBatchCheckAsync(batchResult.AsAsyncJobId.Value);
                while (status.IsComplete == false)
                {
                    Console.WriteLine("Complete: {0}, inProgress: {1}", status.IsComplete, status.IsInProgress);
                    status = await dbx.Files.UploadSessionFinishBatchCheckAsync(batchResult.AsAsyncJobId.Value);
                }
                Console.WriteLine("Complete: {0}, inProgress: {1}", status.IsComplete, status.IsInProgress);
                foreach (var fileStream in openedFileStreams)
                {
                    fileStream.Dispose();
                }
            }
        }

Basically, all I do is check whether the received file is larger than a given size (currently 1 MB); if it is, I upload it in chunks rather than in a single upload.

Obviously I'm using batch upload to avoid lock contention.

The thing is that the small files (the ones smaller than the ChunkSize value) show up fine in the cloud, but the large files don't, even though everything comes back true.

A run output:

Processed file '..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt'.
IN BIG BATCH
numChunks: 7
UPLOADING CHUNK #1
UPLOADING CHUNK #2
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 1048576
UPLOADING CHUNK #3
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 2097152
UPLOADING CHUNK #4
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 3145728
UPLOADING CHUNK #5
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 4194304
UPLOADING CHUNK #6
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 5242880
UPLOADING CHUNK #7
..\..\..\..\CRUD_tests\Generated\Files\randomFile0.txt : AAAAAAAAASchRyKW64_ouA : 6291456
FINISHING CHUNK UPLOAD
/UploadTest/Generated/Files/randomFile0.txt
AAAAAAAAASchRyKW64_ouA
6291456
isAsyncJobId: True isComplete: False, isOther: False
dbjid:AABC0shKW6B4Q2jRSr-MN-OgBGvJ7Myfn6AzhwsF-thuOq4kGnOqdO0B-cLK9vUIKkmR5BBZjL4olQ16-hBUMtlD
Complete: False, inProgress: True
Complete: False, inProgress: True
Complete: True, inProgress: False
Uploads finished.

This run used just one large file, but if I had two smaller files alongside it, those would upload and show up in the cloud. It even looks like the large files upload successfully too, yet they never appear.

Does anyone have any idea what the problem is here? I've run out of ideas.

Thanks in advance for the help!

Cheers,

Andrew

1 Accepted Solution

Greg-DB
Dropbox Staff

I see that you are checking the 'IsComplete' status of the UploadSessionFinishBatchAsync job using UploadSessionFinishBatchCheckAsync, but that will only tell you if the operation is complete overall. It does not indicate whether any particular commit succeeded or not. You should additionally check each UploadSessionFinishBatchResultEntry (in UploadSessionFinishBatchResult.Entries) to see if each one was a Success or a Failure. Additionally, if it failed, you can see why from the UploadSessionFinishError available in UploadSessionFinishBatchResultEntry.Failure.Value.

By the way, I redacted it from your post, but for the sake of security, you should disable that access token, since you posted it here. You can do so by revoking access to the app entirely, if the access token is for your account, here:

https://www.dropbox.com/account/connected_apps

Or, you can disable just this access token using the API:

HTTP: https://www.dropbox.com/developers/documentation/http/documentation#auth-token-revoke

API Explorer: https://dropbox.github.io/dropbox-api-v2-explorer/#auth_token/revoke

.NET: https://dropbox.github.io/dropbox-sdk-dotnet/html/M_Dropbox_Api_Auth_Routes_AuthUserRoutes_TokenRevo...


6 Replies

Andrewer016
Explorer | Level 4

Thank you Greg!

It turned out that it failed with an incorrect offset.

The logic I used to calculate the last offset (the one passed to the finishing method) was wrong. After I fixed it, everything works just fine.
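For anyone else hitting this: as spelled out further down the thread, the finish cursor's offset must equal the total number of bytes uploaded (the file length), not ChunkSize times the last chunk index. A standalone sketch of that bookkeeping (no SDK calls; the file length here is hypothetical, chosen to match the 7-chunk run output above):

```csharp
// Standalone illustration of the cursor-offset arithmetic (no Dropbox SDK calls).
using System;

class OffsetDemo
{
    // Offset to pass when appending chunk i (0-based): bytes already uploaded.
    public static ulong AppendOffset(ulong chunkSize, ulong i) => chunkSize * i;

    // Offset for the finish cursor: ALL bytes uploaded, i.e. the file length.
    public static ulong FinishOffset(ulong fileLength) => fileLength;

    static void Main()
    {
        const ulong ChunkSize = 1024 * 1024;  // 1 MiB, as in the code above
        const ulong FileLength = 6_815_744;   // hypothetical ~6.5 MiB file
        ulong numChunks = (FileLength + ChunkSize - 1) / ChunkSize;

        Console.WriteLine(numChunks);                              // 7
        Console.WriteLine(AppendOffset(ChunkSize, numChunks - 1)); // 6291456 (last append)
        Console.WriteLine(FinishOffset(FileLength));               // 6815744 (finish cursor)
    }
}
```

The bug in the original code is that it reused 6291456 (the last append's offset) for the finish cursor instead of the full file length.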

Also, thanks for erasing the access token. I did forget about it, but it was just the generated access token I use during development, so not a big deal (I changed it anyway).

 

And lastly, I don't know if you could pass this along to the people at Dropbox who are responsible for the .NET SDK documentation, but if you could tell them to mention in the UploadSessionFinishBatchResult section how to reach that class, I'd be grateful. It took me almost an hour to figure out, and I only did because a forum discussion (about the Dropbox Java SDK, not even .NET) mentioned that it can be reached through the UploadSessionFinishBatchJobStatus object.

 

Thanks again, for your help!

Andrew

Greg-DB
Dropbox Staff

Thanks for following up. I'm glad to hear you got this sorted out.

And thanks for the feedback! I'll send this along to the team.

dotNET_Guy
Explorer | Level 3

Hi Guys,

I've re-used this code to upload many files in a batch - thanks for posting it, it's saved me a lot of time.

But as expected, I did reproduce the problem where large files appear to be uploaded, but then do not show up in Dropbox.

 

In my test...
1. All the small files that are < 20KB are uploaded successfully
2. Three large files 120MB, 177MB, and 537MB appear to upload over many chunks (viewing debug statements in Visual Studio) but do not exist in the Dropbox folder when completed

 

I'm using the default 4MB chunk size to upload, e.g. so the 120MB file took 30 chunks to upload.


Q1.
As Andrew later discovered, the calculation of the last offset was incorrect. I have reviewed the code a few times and it appears logically correct to me. Since it evidently isn't, can someone please state what is wrong with the last-offset calculation? Thanks.
I'm going to review this code again to find the bug.

 

Q2.
On another issue also mentioned here, I'm currently using the UploadSessionFinishBatchCheckAsync() method and the .IsComplete property to check for successful completion. But as recommended by Greg, I'd like to check the .IsSuccess property of each entry in the UploadSessionFinishBatchResult, but I can't work out how to reach it.
Can you please point me in the right direction or post a couple of lines of code for this? Thanks.

 

Q3.
Is it a good idea to sleep for 500 ms (or whatever value is appropriate) before each UploadSessionFinishBatchCheckAsync() call in the polling loop?
(await Task.Delay(500);)

 

Thanks.

Greg-DB
Dropbox Staff

@dotNET_Guy 

 

Q1. Perhaps @Andrewer016 would be so kind as to share their updated code with that fixed.

 

Q2. Likewise, perhaps @Andrewer016 can share this piece as well, if they updated their code to include this.

 

Very basically though, it would look like this:

 

foreach (var entry in status.AsComplete.Value.Entries)
{
    if (entry.IsSuccess)
    {
        Console.WriteLine(entry.AsSuccess.Value);
    }
    else if (entry.IsFailure)
    {
        Console.WriteLine(entry.AsFailure.Value);
    }
}

 

You can likewise drill down further into that 'entry.AsFailure.Value' as needed for more specific error information.

 

Q3. Dropbox doesn't have official guidance or policy on how often you should poll UploadSessionFinishBatchCheckAsync, so that's up to you, but adding a short delay like that does seem reasonable.
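That "short delay" suggestion can be sketched with a small, SDK-free helper (PollUntilAsync and the simulated job below are illustrative, not part of the Dropbox SDK; in real code the lambda would wrap UploadSessionFinishBatchCheckAsync):

```csharp
// Generic polling helper that re-checks a condition with a fixed delay between checks.
using System;
using System.Threading.Tasks;

class Poller
{
    // Repeatedly evaluates isComplete, waiting `delay` between checks.
    // Returns the number of checks performed before completion.
    public static async Task<int> PollUntilAsync(Func<Task<bool>> isComplete, TimeSpan delay)
    {
        int checks = 0;
        while (true)
        {
            checks++;
            if (await isComplete()) return checks;
            await Task.Delay(delay);
        }
    }

    static async Task Main()
    {
        int calls = 0;
        // Simulated batch-check that reports complete on the third poll.
        int checks = await PollUntilAsync(
            () => Task.FromResult(++calls >= 3),
            TimeSpan.FromMilliseconds(50));
        Console.WriteLine(checks); // prints 3
    }
}
```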

dotNET_Guy
Explorer | Level 3

Thanks Greg.

After working on this for most of the day, I think I have the solutions. They appear to be working.

 

1.

The final offset should be the position of the end of the file (i.e., the size of the file):

var uploadSessionCursor = new UploadSessionCursor(uploadSessionID, (ulong)fileStream.Length);
await client.Files.UploadSessionAppendV2Async(uploadSessionCursor, true, memoryStream);

 

But to apply this final offset, you will need an extra iteration in the loop, so alter the for loop and the if statement (for the final chunk) accordingly:

for (ulong indexOfChunks = 0; indexOfChunks <= numberOfChunks; indexOfChunks++)

if (indexOfChunks == numberOfChunks)

 

 

2. The UploadSessionFinishBatchCheckAsync() method actually returns an object of type UploadSessionFinishBatchJobStatus:

var uploadSessionFinishBatchJobStatus = await client.Files.UploadSessionFinishBatchCheckAsync(uploadSessionFinishBatch.AsAsyncJobId.Value);

Then use the code you posted above.

 

If you want to simply check whether all jobs in the batch committed successfully, it can be done in one line using LINQ:

result = uploadSessionFinishBatchJobStatus.AsComplete.Value.Entries.All(j => j.IsSuccess);
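As a standalone illustration of that all-committed check (using a hypothetical stand-in record instead of the SDK's UploadSessionFinishBatchResultEntry type):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class LinqCheckDemo
{
    // Stand-in for UploadSessionFinishBatchResultEntry (illustrative only).
    public record Entry(bool IsSuccess);

    // True only when every entry in the batch committed successfully.
    public static bool AllCommitted(IEnumerable<Entry> entries) =>
        entries.All(e => e.IsSuccess);

    static void Main()
    {
        var ok = new List<Entry> { new(true), new(true) };
        var mixed = new List<Entry> { new(true), new(false) };
        Console.WriteLine(AllCommitted(ok));    // True
        Console.WriteLine(AllCommitted(mixed)); // False
    }
}
```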

 
