[jira] Created: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org

Intermittent file corruption on upload
--------------------------------------

                 Key: FILEUPLOAD-149
                 URL: https://issues.apache.org/jira/browse/FILEUPLOAD-149
             Project: Commons FileUpload
          Issue Type: Bug
    Affects Versions: 1.2
         Environment: Linux (CentOS) server; all client platforms and browsers
            Reporter: F. Andy Seidl
            Priority: Critical


I have been struggling for several weeks trying to track down the root cause
of a sporadic file corruption problem using File Upload.  I'm really stumped
at this point and welcome any suggestions as to avenues of debugging
pursuit.

Here's an overview of the problem:

I have eight Linux (CentOS) servers all running the same web application--a
set of Java servlets using Resin as a servlet runner under Apache.  All
servers were configured using the same script that installs jars,
config files, etc.

On three of the servers, File Upload works reliably.  On five of the
servers, File Upload usually (but not always) leaves me with a corrupted file.

Corrupted files are always the correct length but contain a relatively small
percentage of scrambled bytes.  I looked for obvious patterns like newlines
being altered or high-bit bytes being converted to or from UTF-8, but there
is no obvious (to me, anyway) pattern in the failure.

I have also tested with IE, FireFox, and Safari on both Windows and MacOS.
The issue appears to be independent of client browser and OS.

Looking at Java system properties, I notice that while the classpath and
bootclasspath list the same jars on all servers, the jars are not listed
in the same order (probably directory order, as the paths are
constructed by a script that inspects lib directories).

Is anyone aware of classpath order dependencies that could break File
Upload?

Can anyone offer any suggestions about what *might* be breaking File Upload?
Or what other questions I should be asking?  At the moment, I'm feeling
rather stumped.


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537358 ]

F. Andy Seidl commented on FILEUPLOAD-149:
------------------------------------------

I posted this question to the mailing list back in August.  Here is one message that includes some additional details as well as some test results.

-----Original Message-----
From: F. Andy Seidl [mailto:faseidl@...]
Sent: Wednesday, August 22, 2007 11:25 AM
To: user@...
Subject: FW: File Upload Data Corruption Problem

>From: Tahir Akhtar
>
>Do the corrupted files end up the same at each upload, or is there
>some randomness in that too?

[fas] There is randomness in how each file is corrupted.  See test results
below.

>Could it be something outside the file upload component, like disk
>problem or another process corrupting the file after it has been written
>on the server?

[fas] This seems unlikely.  This problem occurs on four different servers,
none of which shows any other signs of trouble.  As part of the tests
below, I uploaded a file and then (from a command line session) immediately
renamed the file.

[snip]

TEST RESULTS
------------

I uploaded a file, pinckney_map.pdf, four times.  Immediately after each
upload, I renamed the file to 1.pdf, 2.pdf, etc.

diff reports that each of the uploaded files differs from all the others.

I then downloaded the four uploaded pdf files (using WinSCP, which has
proven reliable for binary file transfer) to a Windows machine in order to
use the "comp" utility to determine where the first difference was between
each file using this simple script:

@echo off

comp pinckney_map.pdf 1.pdf
comp pinckney_map.pdf 2.pdf
comp pinckney_map.pdf 3.pdf
comp pinckney_map.pdf 4.pdf

comp 1.pdf 2.pdf
comp 1.pdf 3.pdf
comp 1.pdf 4.pdf
comp 2.pdf 3.pdf
comp 2.pdf 4.pdf
comp 3.pdf 4.pdf

The output of that script follows:

Comparing pinckney_map.pdf and 1.pdf...
Compare error at OFFSET 1946
file1 = 1
file2 = 54
Compare error at OFFSET 1947
file1 = 17
file2 = 61
Compare error at OFFSET 1948
file1 = 4
file2 = D3
Compare error at OFFSET 1949
file1 = 40
file2 = 10
Compare error at OFFSET 194A
file1 = 58
file2 = 54
Compare error at OFFSET 194B
file1 = AA
file2 = 55
Compare error at OFFSET 194C
file1 = EB
file2 = BD
Compare error at OFFSET 194D
file1 = 3
file2 = 7D
Compare error at OFFSET 194E
file1 = B
file2 = 35
Compare error at OFFSET 194F
file1 = C3
file2 = 33
10 mismatches - ending compare

Comparing pinckney_map.pdf and 2.pdf...
Compare error at OFFSET 194C
file1 = EB
file2 = 55
Compare error at OFFSET 194D
file1 = 3
file2 = BD
Compare error at OFFSET 194E
file1 = B
file2 = 7D
Compare error at OFFSET 194F
file1 = C3
file2 = 35
Compare error at OFFSET 1950
file1 = 5C
file2 = 33
Compare error at OFFSET 1951
file1 = 15
file2 = 78
Compare error at OFFSET 1952
file1 = B4
file2 = 93
Compare error at OFFSET 1953
file1 = 77
file2 = 60
Compare error at OFFSET 1954
file1 = F
file2 = 16
Compare error at OFFSET 1955
file1 = 3
file2 = 99
10 mismatches - ending compare

Comparing pinckney_map.pdf and 3.pdf...
Compare error at OFFSET 1940
file1 = 7D
file2 = 20
Compare error at OFFSET 1941
file1 = D9
file2 = 35
Compare error at OFFSET 1942
file1 = 38
file2 = 59
Compare error at OFFSET 1943
file1 = 6D
file2 = 61
Compare error at OFFSET 1944
file1 = 78
file2 = 2C
Compare error at OFFSET 1945
file1 = EA
file2 = 54
Compare error at OFFSET 1946
file1 = 1
file2 = 61
Compare error at OFFSET 1947
file1 = 17
file2 = D3
Compare error at OFFSET 1948
file1 = 4
file2 = 10
Compare error at OFFSET 1949
file1 = 40
file2 = 54
10 mismatches - ending compare

Comparing pinckney_map.pdf and 4.pdf...
Compare error at OFFSET 1946
file1 = 1
file2 = 54
Compare error at OFFSET 1947
file1 = 17
file2 = 61
Compare error at OFFSET 1948
file1 = 4
file2 = D3
Compare error at OFFSET 1949
file1 = 40
file2 = 10
Compare error at OFFSET 194A
file1 = 58
file2 = 54
Compare error at OFFSET 194B
file1 = AA
file2 = 55
Compare error at OFFSET 194C
file1 = EB
file2 = BD
Compare error at OFFSET 194D
file1 = 3
file2 = 7D
Compare error at OFFSET 194E
file1 = B
file2 = 35
Compare error at OFFSET 194F
file1 = C3
file2 = 33
10 mismatches - ending compare

Comparing 1.pdf and 2.pdf...
Compare error at OFFSET 1946
file1 = 54
file2 = 1
Compare error at OFFSET 1947
file1 = 61
file2 = 17
Compare error at OFFSET 1948
file1 = D3
file2 = 4
Compare error at OFFSET 1949
file1 = 10
file2 = 40
Compare error at OFFSET 194A
file1 = 54
file2 = 58
Compare error at OFFSET 194B
file1 = 55
file2 = AA
Compare error at OFFSET 194C
file1 = BD
file2 = 55
Compare error at OFFSET 194D
file1 = 7D
file2 = BD
Compare error at OFFSET 194E
file1 = 35
file2 = 7D
Compare error at OFFSET 194F
file1 = 33
file2 = 35
10 mismatches - ending compare

Comparing 1.pdf and 3.pdf...
Compare error at OFFSET 1940
file1 = 7D
file2 = 20
Compare error at OFFSET 1941
file1 = D9
file2 = 35
Compare error at OFFSET 1942
file1 = 38
file2 = 59
Compare error at OFFSET 1943
file1 = 6D
file2 = 61
Compare error at OFFSET 1944
file1 = 78
file2 = 2C
Compare error at OFFSET 1945
file1 = EA
file2 = 54
Compare error at OFFSET 1946
file1 = 54
file2 = 61
Compare error at OFFSET 1947
file1 = 61
file2 = D3
Compare error at OFFSET 1948
file1 = D3
file2 = 10
Compare error at OFFSET 1949
file1 = 10
file2 = 54
10 mismatches - ending compare

Comparing 1.pdf and 4.pdf...
Compare error at OFFSET 16CA6
file1 = 25
file2 = C3
Compare error at OFFSET 16CA7
file1 = 44
file2 = D1
Compare error at OFFSET 16CA8
file1 = B5
file2 = 8E
Compare error at OFFSET 16CA9
file1 = B6
file2 = 20
Compare error at OFFSET 16CAA
file1 = 9D
file2 = F2
Compare error at OFFSET 16CAB
file1 = C5
file2 = 6B
Compare error at OFFSET 16CAC
file1 = BA
file2 = 63
Compare error at OFFSET 16CAD
file1 = C5
file2 = ED
Compare error at OFFSET 16CAE
file1 = CD
file2 = 98
Compare error at OFFSET 16CAF
file1 = EE
file2 = 12
10 mismatches - ending compare

Comparing 2.pdf and 3.pdf...
Compare error at OFFSET 1940
file1 = 7D
file2 = 20
Compare error at OFFSET 1941
file1 = D9
file2 = 35
Compare error at OFFSET 1942
file1 = 38
file2 = 59
Compare error at OFFSET 1943
file1 = 6D
file2 = 61
Compare error at OFFSET 1944
file1 = 78
file2 = 2C
Compare error at OFFSET 1945
file1 = EA
file2 = 54
Compare error at OFFSET 1946
file1 = 1
file2 = 61
Compare error at OFFSET 1947
file1 = 17
file2 = D3
Compare error at OFFSET 1948
file1 = 4
file2 = 10
Compare error at OFFSET 1949
file1 = 40
file2 = 54
10 mismatches - ending compare

Comparing 2.pdf and 4.pdf...
Compare error at OFFSET 1946
file1 = 1
file2 = 54
Compare error at OFFSET 1947
file1 = 17
file2 = 61
Compare error at OFFSET 1948
file1 = 4
file2 = D3
Compare error at OFFSET 1949
file1 = 40
file2 = 10
Compare error at OFFSET 194A
file1 = 58
file2 = 54
Compare error at OFFSET 194B
file1 = AA
file2 = 55
Compare error at OFFSET 194C
file1 = 55
file2 = BD
Compare error at OFFSET 194D
file1 = BD
file2 = 7D
Compare error at OFFSET 194E
file1 = 7D
file2 = 35
Compare error at OFFSET 194F
file1 = 35
file2 = 33
10 mismatches - ending compare

Comparing 3.pdf and 4.pdf...
Compare error at OFFSET 1940
file1 = 20
file2 = 7D
Compare error at OFFSET 1941
file1 = 35
file2 = D9
Compare error at OFFSET 1942
file1 = 59
file2 = 38
Compare error at OFFSET 1943
file1 = 61
file2 = 6D
Compare error at OFFSET 1944
file1 = 2C
file2 = 78
Compare error at OFFSET 1945
file1 = 54
file2 = EA
Compare error at OFFSET 1946
file1 = 61
file2 = 54
Compare error at OFFSET 1947
file1 = D3
file2 = 61
Compare error at OFFSET 1948
file1 = 10
file2 = D3
Compare error at OFFSET 1949
file1 = 54
file2 = 10
10 mismatches - ending compare
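
The same kind of comparison could also be run directly on the Linux servers, without first copying the files to Windows for comp. The following is only a minimal sketch: the class name FirstDifference and its methods are hypothetical, not part of Commons FileUpload or of the test setup described above. It reports the offset of the first differing byte between two files.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Commons FileUpload.
final class FirstDifference {

    /**
     * Returns the offset of the first differing byte, or -1 if both streams
     * have identical content. If one stream is a strict prefix of the other,
     * the length of the shorter stream is returned.
     */
    static long firstDifference(InputStream a, InputStream b) throws IOException {
        long offset = 0;
        while (true) {
            int x = a.read();
            int y = b.read();
            if (x != y) {
                return offset;   // also covers one side reaching EOF first
            }
            if (x == -1) {
                return -1;       // both streams ended together: identical
            }
            offset++;
        }
    }

    /** Convenience overload that opens both files with buffering. */
    static long firstDifference(Path p, Path q) throws IOException {
        try (InputStream a = new BufferedInputStream(Files.newInputStream(p));
             InputStream b = new BufferedInputStream(Files.newInputStream(q))) {
            return firstDifference(a, b);
        }
    }

    public static void main(String[] args) throws IOException {
        long off = firstDifference(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(off < 0
                ? "identical"
                : "first difference at offset 0x" + Long.toHexString(off).toUpperCase());
    }
}
```

Run it with two file names as arguments, for example the original file and one uploaded copy. Unlike comp, it stops at the first mismatch rather than listing ten.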



[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538225 ]

Jochen Wiedmann commented on FILEUPLOAD-149:
--------------------------------------------

The only confirmed problem with the current release is FILEUPLOAD-135. There is also FILEUPLOAD-144, which I suspect to be a duplicate. Please try to use a current snapshot, as available from

    http://people.apache.org/repo/m2-snapshot-repository/org/apache/commons/fileupload/commons-fileupload/1.3-SNAPSHOT/




[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538226 ]

Jochen Wiedmann commented on FILEUPLOAD-149:
--------------------------------------------

I studied the mail from F. Andy Seidl you are quoting and I must admit that I simply don't believe it. If the problems were as extreme as he states (no successful upload at all), then commons fileupload would be completely useless. Which it is not, because quite a lot of people use it in the wild.




[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538229 ]

F. Andy Seidl commented on FILEUPLOAD-149:
------------------------------------------

Hi Jochen:

>>I studied the mail from F. Andy Seidl you are quoting and I must admit that I simply don't believe it. <<
I can understand that, but I assure you, this is the case.  (If you are interested, I can arrange to provide you with credentials to a site you can experience this first-hand.)

I have the same application deployed on eight Linux servers and one Windows server.  File uploading works flawlessly and reliably on three of the Linux servers and the Windows server.  The other five Linux servers show the symptoms I documented above.

I will try using the current snapshot as you recommend.  But in the meantime, if anyone can think of anything that might cause this server-dependent failure, I'm eager for suggestions.  I've been scratching my head over this one for quite some time.

Thanks.



[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588754#action_12588754 ]

F. Andy Seidl commented on FILEUPLOAD-149:
------------------------------------------

Hi Jochen (and others):

I recently re-tried FileUpload on the affected Linux servers after upgrading to the latest release version of commons-io, but the issues remain.  I realize that Jochen does not believe these results, but I remain stumped as to why the same code runs on most of my servers but corrupts files on others.  It must be some type of environmental issue, but like what? This is particularly nasty for me to debug (lacking any guidance or hypotheses) because I *only* see failures on a small number of production machines.  I have not been able to reproduce these failures on a development system.  As such, it is a very slow debugging cycle to deploy debugging code and do bug hunting on a production server.

*Any* suggestions as to what might create the type of corruptions I am seeing (and I really am seeing them) would be greatly appreciated.

Thank you,
  -- fas



[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594692#action_12594692 ]

F. Andy Seidl commented on FILEUPLOAD-149:
------------------------------------------

Hi All,

I'm back to debugging this FileUpload corruption issue.  I've just installed commons-fileupload-1.2.1 and commons-io-1.4 and ran a simple test...

I started with a local .jpg image, fileupload-test.jpg, and created two copies of that file: fileupload-test-copy.jpg and fileupload-test-copy-2.jpg.  I uploaded the first two using FileUpload here:

http://faseidl.com/docs/fileupload-test.jpg
http://faseidl.com/docs/fileupload-test-copy.jpg

I FTP-ed the third file here:

http://faseidl.com/docs/fileupload-test-copy-2.jpg

As you can see, the FileUpload-ed files are corrupted while the FTP-ed file is not.

This situation occurs on five different servers while things work great on six others.  All hypotheses welcome.




[jira] Commented: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


    [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594768#action_12594768 ]

F. Andy Seidl commented on FILEUPLOAD-149:
------------------------------------------

Very interesting results.  Does this trigger any ideas?

My code writes uploaded files to disk by directly copying the FileItem's
input stream to the final file location.  As an experiment to rule out a
problem in this copy operation, I created a small (1112 byte) .jpg
image file named fu-0.jpg.  Then I created nine exact copies of the
file: fu-1.jpg, fu-2.jpg, ..., fu-9.jpg.

I FTP-ed fu-0.jpg to the server and I uploaded files 1-9 using
FileUpload in a single upload request with nine files attached in
numeric order.  All the files uploaded without error except files 4 and
5.

Here is where it gets interesting...

Look at the sizes of the uploaded files:

fu-0.jpg: 1112
fu-1.jpg: 1112
fu-2.jpg: 1112
fu-3.jpg: 1112
fu-4.jpg: 1155   <=== wrong size
fu-5.jpg: 1069   <=== wrong size
fu-6.jpg: 1112
fu-7.jpg: 1112
fu-8.jpg: 1112
fu-9.jpg: 1112

BUT... files 4 and 5 add up to 2224 bytes, which is exactly two times the
correct size of 1112 bytes.  Thus, a worthy hypothesis is that the code
that parsed the raw servlet request stream somehow missed the
boundary between the 4th and 5th file items.

To rule out an alternate hypothesis that this was a random glitch
unrelated to recognizing item boundaries, I repeated the test with ten
new files, fu-a.jpg, fu-b.jpg, ..., fu-i.jpg.  But this time, instead of
FileUpload-ing nine files at once, I FileUploaded each file separately.
In this case, all ten files uploaded without error.
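
Tests like these can be made less manual by computing a digest while the upload is copied to disk and comparing it with a digest of the known-good original. The following is a minimal sketch under stated assumptions: DigestingCopy and copyAndDigest are hypothetical names, and the input stream stands in for whatever stream the application copies (for example the FileItem's input stream mentioned above). Only standard JDK classes are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of Commons FileUpload.
final class DigestingCopy {

    /**
     * Copies everything from in to out and returns the SHA-256 digest of the
     * copied bytes as lowercase hex. Neither stream is closed here.
     */
    static String copyAndDigest(InputStream in, OutputStream out) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);   // digest exactly the bytes that are written
            out.write(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

The digest only describes the bytes the servlet actually received, so it has to be compared with a digest of the original file computed elsewhere (for instance on the client, or on a copy transferred by FTP). A mismatch logged at upload time would flag a corrupted upload without a manual byte-by-byte comparison.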



[jira] Resolved: (FILEUPLOAD-149) Intermittent file corruption on upload

by JIRA jira@apache.org


     [ https://issues.apache.org/jira/browse/FILEUPLOAD-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

F. Andy Seidl resolved FILEUPLOAD-149.
--------------------------------------

    Resolution: Invalid

While the symptoms were real, the cause was outside (below) FileUpload. Specifically, within MultipartStream.makeAvailable(), the following call was violating its contract:

int bytesRead = input.read(buffer, tail, bufSize - tail);

Under certain circumstances that are not understood but are repeatable, the call to read() was acting as if tail==0 when it actually was non-zero (i.e., tail==pad).  When this occurred, new data was written to the front of the buffer instead of tail bytes into the buffer.  The result was tail bytes of data being dropped from the buffer, the new data being shifted to the front of the buffer, and tail bytes of garbage being re-used at the end of the buffer.  In short, a corrupted file that retained its correct size (except for the even rarer occurrences when the lost bytes contained a boundary separator).

Ultimately the problem was in the implementation of the input stream class, which in this case, was com.caucho.server.connection.ServletInputStreamImpl, which is supplied by Resin.

I did not determine under what circumstances this stream implementation was failing. However, read() was always returning 1460 bytes (a TCP packet size) even though the buffer was 4K, which suggested that this was not a buffered implementation. I wrapped the raw input stream in a BufferedInputStream (by overriding getInputStream in ServletRequestContext) and the problems disappeared.
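
The failure mode and the workaround described above can be reproduced in isolation with a small sketch that uses only java.io. Everything in it is a hypothetical stand-in: OffsetIgnoringStream simulates a stream that always ignores the offset argument (the real stream did so only intermittently), and fillAfterTail imitates the read() call in makeAvailable(). Neither is Resin or FileUpload code.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class OffsetBugDemo {

    /** Hypothetical faulty stream: read(b, off, len) ignores off (simulated bug). */
    static class OffsetIgnoringStream extends InputStream {
        private final InputStream in;

        OffsetIgnoringStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Contract violation: the data lands at index 0 instead of index off.
            return in.read(b, 0, len);
        }
    }

    /** Imitates makeAvailable(): keep tail bytes, then read new data after them. */
    static byte[] fillAfterTail(InputStream input, byte[] kept, int bufSize)
            throws IOException {
        byte[] buffer = new byte[bufSize];
        int tail = kept.length;
        System.arraycopy(kept, 0, buffer, 0, tail);
        int bytesRead = input.read(buffer, tail, bufSize - tail);
        return Arrays.copyOf(buffer, tail + Math.max(bytesRead, 0));
    }

    public static void main(String[] args) throws IOException {
        byte[] kept = {1, 2, 3};
        byte[] incoming = {10, 11, 12, 13};

        // Raw faulty stream: kept bytes are overwritten, stale bytes trail the data.
        byte[] direct = fillAfterTail(new OffsetIgnoringStream(incoming), kept, 8);
        System.out.println(Arrays.toString(direct));   // [10, 11, 12, 13, 0, 0, 0]

        // Wrapped stream: BufferedInputStream fills its own buffer from index 0,
        // then copies into the caller's buffer at the requested offset.
        InputStream wrapped = new BufferedInputStream(new OffsetIgnoringStream(incoming));
        byte[] fixed = fillAfterTail(wrapped, kept, 8);
        System.out.println(Arrays.toString(fixed));    // [1, 2, 3, 10, 11, 12, 13]
    }
}
```

In both cases the result has the correct length (7 bytes), which matches the symptom of corrupted files with the right size. Note that the wrapper only shields the caller while the requested length is smaller than the wrapper's internal buffer (8192 bytes by default); a larger request on an empty buffer is passed straight through to the underlying stream.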
