Discussion point: CONSPEC - Context-specific Issues


by Steven M. Christey

Below is the current writeup for Context-specific issues, as also seen
on the CWE web site:

  http://cwe.mitre.org/community/research/discussion/conspec.html

A list of potentially affected nodes is here, although I see that the
list is incomplete and has a couple items that don't belong (I'll work
to clean it up):

  http://cwe.mitre.org/community/research/discussion/reports/rpt-conspec.html

Please review the potential solutions and let us know if our
recommendations will work for you (or not).

In a separate post, I'll provide a detailed writeup of the kinds of
changes that would be made to the CWE nodes, based on the topics
listed in this discussion point.

Please post your comments to this list.


Thank you,
Steve


================================
CONSPEC: Context-specific Issues
================================

Some issues are generally thought to be "bad practice" or misuse,
but they can be used in certain contexts that are perfectly
legitimate.

Examples:

   CWE-481 - Assigning instead of comparing
   CWE-482 - Comparing instead of assigning
   CWE-486 - Comparing Classes by Name
   CWE-568 - Erroneous Finalize Method
   CWE-572 - Calling Thread.run() instead of Thread.start()
   CWE-597 - Erroneous String Compare

Code:

        char *a, *b;
        ...
        if (a == b) {...} /* 1 */
        if (! strcmp(a,b)) {...} /* 2 */


Issue 1: Inclusion
------------------

Any possible security-related impact of these issues is dependent on
the context and policy of their use. CWE-597 might be flagged as an
issue, as in the above code, but it really depends on the context and
the programmer's intent. Comparing strings using == is only
problematic if the programmer meant to compare the strings and not
their pointers. Likewise, just because strcmp() is used in the context
of comparing two char *, it does not mean that the programmer didn't
intend to compare the pointer references. Furthermore, in some
languages, such as Python, == is the recommended way of testing string
equivalence.
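
The difference between the two comparisons can be made concrete with a
minimal C sketch. The helper names same_pointer() and same_content() are
hypothetical and chosen only for illustration; which one is "correct"
still depends on what the programmer intended.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Identity: true only when both pointers refer to the same object. */
bool same_pointer(const char *a, const char *b)
{
    return a == b;
}

/* Content: true when both strings hold the same characters. */
bool same_content(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);  /* strcmp() requires valid strings */
    return strcmp(a, b) == 0;
}
```

Two separate arrays that both hold "admin" satisfy same_content() but
not same_pointer(), which is exactly the case a scanner cannot judge
without knowing the intent.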

With CWE-572, there is no issue if the programmer intended for the
run() function to operate in a synchronous context. Since these issues
are hard to identify and might depend on external operational
contexts, they may lead to high false positive or false negative rates
in detection and should not be included as weaknesses.

With CWE-481, it is a common idiom to perform a variable assignment
within a conditional. While this might be regarded as a risky
practice, many such constructs are performed correctly, even if
automated tools might flag them as unintentional assignments.
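
As an illustration of the CWE-481 point, the following C sketch puts
the intentional idiom next to the unintended one. Both function names
are made up for this example, and the second function is deliberately
wrong.

```c
#include <assert.h>
#include <stddef.h>

/* Intentional: assign the next character to c, then compare the
 * assigned value against the terminator. */
size_t count_spaces(const char *s)
{
    size_t n = 0;
    char c;

    assert(s != NULL);
    while ((c = *s++) != '\0') {   /* assignment plus explicit comparison */
        if (c == ' ')
            n++;
    }
    return n;
}

/* Unintended (the CWE-481 pattern): "=" where "==" was meant.
 * The condition stores 1 in *is_admin and is therefore always true.
 * The doubled parentheses only keep the compiler warning quiet here. */
int buggy_check(int *is_admin)
{
    if ((*is_admin = 1))
        return 1;
    return 0;
}
```

Syntactically the two conditions look alike; only the explicit
comparison in the first one hints at the programmer's intent.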


Issue 2: Abstraction
--------------------

It could be argued that these are all issues in which the programmer
is not doing what is intended. Therefore, these might be restructured
under a general context-specific grouping. Similarly, it could be
argued that these are all function or language-specific issues and
should be restructured or merged into that issue. Note: the mechanisms
for node restructuring are still being defined.


Possible Solutions
------------------

1) Abstract all context-dependent issues under a context-dependent
   parent to indicate the potential problems when dealing with these
   weaknesses, making the current CWEs its children.

2) Treat these as a case of language-specific issues.

3) Leave the context-dependent entries in place in CWE, incorporating
   caveats as to their use, and highlight context-dependent
   circumstances.

4) MERGE all context-dependent entries under a single abstract parent
   (see the node restructuring page for possible approaches).


Relevant Use-Cases
------------------

Assessment Vendors: Fairly easy to identify potential code issues, but
   hard to judge context and programmer intent. Nonetheless, they
   could be flagged for further analysis by a developer, although it
   could lead to a lot of false positives and/or false negatives.

Assessment Customers: Useful when interpreting potential issues
   flagged by vendors.

Educators: Identify common language mistakes and misinterpretations to
   use to highlight specific instances.

Academic Researchers: Identify ambiguous aspects of languages that
   allow programmers to misuse features, whether intentionally or
   unintentionally.

Applied Vulnerability Researchers: Pinpoint areas to look at as
   context can more easily be determined and problems identified,
   especially since the code scanners may make many mistakes and
   developer confusion/syntactic similarities could cause these issues
   to go unnoticed.

Refined Vulnerability Information (RVI) Providers: Not applicable.

Software Customers: Not applicable.

Software Developers: Help to identify certain context-dependent issues
   that are commonly missed and might require extra attention.


Recommendation
--------------

The CWE Researcher Community is strongly encouraged to provide
feedback to the CWE team or the researchers list regarding this
recommendation.

To provide stability and logical categorization, and to maximize
overall usability for all of the potential CWE customers, the MITRE CWE
team recommends keeping the current CWEs and their locations as they are.
Any descriptions should be abstracted to remove language-specific
inclinations where applicable, and any usage or context-specific
caveats should be called out where necessary. These could be treated
as examples of "sub-nodes." Note: the mechanisms for node
restructuring are still being defined; see the node restructuring page
for possible approaches.

For example, this would mean the CWE-597 (Erroneous String Compare)
entry would be described as "strings should be compared by content and
not by references." This means that functions like String.equals()
(Java) and strcmp (C) should be used for string comparison, instead of
== or !=. In some languages, like Python, comparing string contents is
performed via ==, as might be expected. As a caveat, if the program is
intending to compare string references or pointers, then == and != are
the correct comparison operators.


Notes
-----

Below is preliminary work done in order to more clearly identify
problems present in CWE. Any issues not addressed above should be
brought to the attention of the whole list, especially if the CWE ID
is missing from the notes below.

Types of context-specific issues:

     * failure to adhere to a tech-specific specification
       568, 574, 575, 576, 577, 578, 580, 581, 582, 583

     * use of unsafe or error-prone constructs
       584, 586, 587, 588

     * legitimate functionality that's a security concern only in
       privileged or other limited contexts
       570, 571, 589, 595, 597, 598, 8, 9, 481, 482


Complete List of Examples
-------------------------

All CWE nodes that are affected by this discussion point are listed on
a separate page:

  http://cwe.mitre.org/community/research/discussion/reports/rpt-conspec.html

Re: Discussion point: CONSPEC - Context-specific Issues

by Pascal Meunier

Steven M. Christey wrote:

>
> Possible Solutions
> ------------------
>
> 1) Abstract all context-dependent issues under a context-dependent
>    parent to indicate the potential problems when dealing with these
>    weaknesses, making the current CWEs its children.
>
> 2) Treat these as a case of language-specific issues.
>


I'd like best a combination of 1 & 2, but I also see a 5th possibility
(see below).  I think it would be very useful to have a "view" of the
CWE that branches out on the language used, or that is filtered based on
language (I'm getting ahead of #5).  I could also see a case made for a
"language-independent" node.  IMO the major reasons to do this are:

i. Many people are interested in the issue of comparing languages.  How
a language fares in avoiding unnecessary pitfalls and traps, or in being
easy to use securely, is of interest.  Many best practices issues
are language dependent, if not caused by syntactic or semantic issues
within the language.  Language-independent issues should be emphasized
in secure programming courses, as covering these will likely have the
greatest ROI.

ii. It would reduce false positives, or make it easier to filter reports
based on the language used.

iii. It would be less confusing for programmers trying to learn secure
programming best practices, for example when adopting a different
language, or when trying to identify issues in their programs.  This
means that the CWE would be easier to use, more practical and relevant.
  For example, there are other popular languages besides Python in which
== is the proper operator for string comparison.

It wouldn't bother me to find the same CWE node as a child of two
different problematic languages at the same time.

There is also a 5th possibility:

5) Use tags.  CWE issues could be tagged with language names that are
affected and not affected by the issues.  Having both lists of affected
and not affected would make it clear if a specific language has not been
evaluated for that CWE issue.  The tag mechanism could also be used for
other means, such as a primitive mechanism for generating CWE "views",
and for searching of course.  For example, "prune branches not
containing tag XYZ" or "remove CWE nodes (collapsing the tree) not
containing tag XYZ".  In addition, MITRE could allow user-defined tags.
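
One way to picture the tag idea is the rough C sketch below. Everything
in it is an assumption made for illustration: the struct layout, the
TAG_* names, and the flat array (a real view would prune or collapse a
tree) are not part of CWE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical language tags, one bit per language. */
enum { TAG_C = 1 << 0, TAG_JAVA = 1 << 1, TAG_PYTHON = 1 << 2 };

enum tag_status { NOT_EVALUATED, AFFECTED, NOT_AFFECTED };

struct cwe_node {
    int id;
    unsigned affected;      /* languages tagged as affected */
    unsigned not_affected;  /* languages tagged as not affected */
};

/* Keeping both lists lets "not evaluated" be told apart from "not affected". */
enum tag_status status_for(const struct cwe_node *node, unsigned lang)
{
    assert(node != NULL);
    if (node->affected & lang)
        return AFFECTED;
    if (node->not_affected & lang)
        return NOT_AFFECTED;
    return NOT_EVALUATED;
}

/* "Remove nodes not containing tag XYZ": copy matching nodes to out
 * (which must have room for n entries) and return how many were kept. */
size_t filter_by_tag(const struct cwe_node *in, size_t n,
                     unsigned lang, struct cwe_node *out)
{
    size_t kept = 0;

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (in[i].affected & lang)
            out[kept++] = in[i];
    }
    return kept;
}
```

For instance, a node for CWE-597 tagged as affecting C and Java and as
not affecting Python would report NOT_AFFECTED for Python, while a node
carrying no Python tag at all would report NOT_EVALUATED.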

Cheers,
Pascal

Re: Discussion point: CONSPEC - Context-specific Issues

by Steven M. Christey

On Tue, 18 Sep 2007, pmeunier wrote:

> I think it would be very useful to have a "view" of the
> CWE that branches out on the language used, or that is filtered based on
> language (I'm getting ahead of #5).

We definitely see this too; in fact, a language-specific view is the first
one listed here:

  http://cwe.mitre.org/community/research/views.html



> I could also see a case made for a "language-independent" node.

My hope would be that most nodes would have a language-independent aspect
to them, perhaps as a parent.  It's not necessarily CWE's job to try to
determine what these parents would be (see some of the CONSPEC change
notes, e.g. CWE-568 and CWE-484) but it would be good, not to mention
educational, to do this where we can.

>  IMO the major reasons to do this are:
>
> i. Many people are interested in the issue of comparing languages.  How
> a language fares in avoiding unnecessary pitfalls and traps, or in being
> easy to use securely, is of interest.

I've been thinking a little bit about what a language-specific view might
be like in the context of comparing languages.  It seems to me that you
don't want to ignore the language-independent issues.  For example, "OS
Command Injection" is language-independent, but shell metacharacter
injection is more "naturally" avoided in Visual C on Windows than it is in
C on Unix, because (as I understand it) CreateProcess(), which is heavily
used in Windows, only executes a single command.

That said, at least being able to identify the differences between
languages would be a good start.

> iii. It would be less confusing for programmers trying to learn secure
> programming best practices, for example when adopting a different
> language, or when trying to identify issues in their programs.

Do you mean something like this: "I'm learning a new language.  What are
the specific things I need to worry about?"  That's one of the
applications we see for a language-specific view.

> There is also a 5th possibility:
>
> 5) Use tags.  CWE issues could be tagged with language names that are
> affected and not affected by the issues.

We currently have a poorly-named attribute "Platform" which mostly lists
different languages, although it's not as well-populated as we'd like.
We think this attribute might need to be changed in the future to handle
similar concepts.

> Having both lists of affected and not affected would make it clear if a
> specific language has not been evaluated for that CWE issue.  The tag
> mechanism could also be used for other means, such as a primitive
> mechanism for generating CWE "views", and for searching of course.  For
> example, "prune branches not containing tag XYZ" or "remove CWE nodes
> (collapsing the tree) not containing tag XYZ".  In addition, MITRE could
> allow user-defined tags.

We're not yet sure how exactly we'll be "implementing" views.  It would be
nice to do it in a well-structured fashion to facilitate as much automatic
data handling as possible, but in the early stages, it might be more
lightweight to implement less-structured tags.  Thanks for the ideas!

- Steve

RE: Discussion point: CONSPEC - Context-specific Issues

by Sean Barnum

I very much agree with the recommendation here.
These nodes should stay where they are but potentially have their descriptions tweaked to explain any context-specific characteristics.

Sean Barnum

-----Original Message-----
From: owner-cwe-research-list@... [mailto:owner-cwe-research-list@...] On Behalf Of Steven M. Christey
Sent: Monday, September 17, 2007 8:06 PM
To: CWE-RESEARCH-LIST@...
Subject: Discussion point: CONSPEC - Context-specific Issues


Path Issue - Triple Dot - '...'

by Robert C. Seacord


The context notes for this CWE leaf node (http://cwe.mitre.org/data/definitions/32.html) say the following:

Context Notes

This manipulation is effective in two different contexts: (1) it is equivalent to "..\.." on Windows, or (2) it can take advantage of insufficient filtering, e.g. if the programmer does a single-pass removal of "./" in a string (collapse of data into unsafe value)

I have not been able to use "..." in place of "..\.." on any of my Windows systems.  Where is this an issue?

As a more general comment--have you given any thought to collapsing some of these together?  There seem to be an awful lot of nuanced distinctions.  For example, if you were to introduce the term "separator character" which could be equal to '\' or '/' you could quickly eliminate a number of leaves in this section.

Thanks,
rCs

-- 
Robert C. Seacord
Senior Vulnerability Analyst
CERT/CC 

Work: 412-268-7608
FAX: 412-268-6989

Re: Path Issue - Triple Dot - '...'

by Pascal Meunier

In Windows 95 and 98, "..." goes up two directories.

http://projects.cerias.purdue.edu/secprog/class2/7.Canon_&_DT.pdf (my
slides, apologies for the self-citation)

according to
http://www.iss.net/security_center/advice/Intrusions/2000617/default.htm

even 4 dots are possible, going up three directories.

Pascal



Path Traversal and some challenges for CWE

by Steven M. Christey

I'm changing the subject line because Robert's question touches on some
systemic CWE issues, and we'd really like the CWE community to give us
their thoughts.


Robert C. Seacord asked:

> I have not been able to use "..." in place of "..\.." on any of my
> Windows systems.  Where is this an issue?

I believe this applied to older Windows systems, at least NT.  I
distinctly remember testing it, because I couldn't believe it was true :)

> As a more general comment--have you given any thought to collapsing some
> of these together?  There seem to be an awful lot of nuanced
> distinctions.  For example, if you were to introduce the term "separator
> character" which could be equal to '\' or '/' you could quickly
> eliminate a number of leaves in this section.

We've done a little bit of thinking on this.  The path traversal nodes are
one part of the tree that seems excessively deep.

Note that the current path traversal leaves, such as "..." and "....//",
are all attack-focused.  They concentrate on specific manipulations that
are nonetheless a little different from each other.

There are a few different ways this could be handled.

One would be to abstract the leaf nodes like so:

  - merge of '/' or '\' as "directory separators".  Issue: this doesn't
    account for the frequent occurrence where an application accounts for
    one separator but not the other, especially in Windows systems where
    both "/" and "\" often work.

  - merge of leading/trailing/internal separators.  Issue: as with / and
    \, sometimes you'll encounter cases where one works but not the other.

  - merge multiple dots or other doubled sequences.  Issue: well, you get
    the drift.


A second approach is to focus more on the underlying weaknesses, and try
to infer WHY these manipulations really work.

In that sense, we might have something like:

  - the canonical example - "../"

  - failure to account for syntactic equivalence.  This might address
    multiple internal "//" and perhaps "..."

  - failure to account for OS-specific variants.  In relative path
    traversal, this might be "/" and "\" (Unix vs. Windows), as well as
    "..." and "C:\drive\letter" (Windows only)

  - protection mechanism failure (PMF): an attempt to protect against
    traversal is incorrect or incomplete.  Ideally, each PMF would have
    its own CWE as well.  But PMFs get into the whole complicated notion
    of weakness chains (a la integer-overflow-leads-to-buffer-overflow),
    and I'm not sure how we could best handle chains within CWE.
    However, it's a good topic to explore.

    At any rate, for example, "....//" might fall under this category,
    since it's not valid to the OS as-is, but a collapse-into-unsafe-value
    error (CWE-182) might reduce this string into "../" which then enables
    the traversal to take place.
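
The "....//" case can be reproduced with a small C sketch of a flawed
single-pass filter. The function names are hypothetical, and the
repeat-until-stable variant is shown only for contrast, not as a
recommended defense.

```c
#include <assert.h>
#include <string.h>

/* Flawed filter: drops each "../" found in one left-to-right pass and
 * never re-examines the text that the removals leave behind.
 * out must have room for strlen(in) + 1 bytes; out may equal in. */
void strip_dotdot_once(const char *in, char *out)
{
    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        if (strncmp(in, "../", 3) == 0)
            in += 3;           /* drop the sequence */
        else
            *out++ = *in++;    /* keep the character */
    }
    *out = '\0';
}

/* Contrast: repeat the pass in place until no "../" remains. */
void strip_dotdot_fixpoint(char *buf)
{
    assert(buf != NULL);
    while (strstr(buf, "../") != NULL)
        strip_dotdot_once(buf, buf);
}
```

Given "....//", the single pass removes the inner "../" and the
surviving characters collapse into "../", which is the
collapse-into-unsafe-value behavior described above.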

As you can see, there are a few options.  The second approach, while more
weakness-focused, still has overlapping categories.

===== So what should happen to the current low-level leaf nodes? =====

If it's decided that these are too low-level, or the wrong perspective,
then some options for handling them are covered at
http://cwe.mitre.org/community/research/restructuring.html

But the question still remains about what to do.

Some CWE consumers might advocate that the existing low-level nodes should
stay as they are, because there are some cases where this precision would
be useful, such as in a pen-test or black-box scanning situation where you
would want to ensure broad coverage, *or* if a programmer is implementing
a protection mechanism and wants to figure out what potential problems
could arise.  However, these nodes, as currently described, are inherently
attack-focused.  In addition, you wind up with a risk of combinatorial
explosion if you try to ensure that CWE's coverage is complete in this
area.  Path traversal has been a favorite vulnerability of mine for years,
so I've studied it pretty closely, and I suspect that even the existing
set of leaf nodes is incomplete.

Others would think that a middle ground might be more appropriate.  This
makes sense for a few use-cases, including general education, as well as
characterizing code analysis tool capabilities - they mostly only spot
"pathname can be manipulated", so all (or most) leaf nodes would apply.

Note that CWE contains a LOT of nodes that are like this.  XSS (CWE-79),
Buffer Errors (CWE-119), and Information Leaks (CWE-200) are some other
examples.

===== The larger questions =====

The path traversal nodes illustrate some larger questions that we face for
CWE:

1) When there are competing perspectives, which ones do we adopt?  The
   use-cases we defined at
   http://cwe.mitre.org/community/research/stakeholders.html, plus
   feedback from the CWE research community (i.e. you), will help us to
   decide this.

2) If we adopt multiple perspectives, then how do we modify CWE so that it
   supports these perspectives, both internally within the XML, and
   externally in terms of presenting them to CWE consumers without
   confusing them?  Hopefully, views will be able to account for these,
   but there's also the question of resources/labor and where we direct
   our efforts (although community involvement might help here, too).

3) Do we aim for theoretical completeness at a low level for CWE, which
   could translate into combinatorial explosions and perhaps tens of
   thousands of CWEs, or do we live with incomplete nodes and the
   associated biases that they would indirectly introduce into
   quantitative comparisons?  Are there ways of handling the combinatorics
   without producing an excessively large set of CWEs?  (Some other MITRE
   initiatives such as CCE have their own ways of dealing with
   combinatorics.)


You can see some of these questions "in action" on the discussion points
page on the CWE web site, at
http://cwe.mitre.org/community/research/content_discussion.html

- Steve

Re: Path Traversal and some challenges for CWE

by Pascal Meunier

I've wondered why a regular expression or grammar couldn't be used
sometimes in the CWE, to bring combinatorial explosions under control.
Isn't it sufficient, at least for the "pen-test or black-box scanning
situation" that the product fails when given one of the possible input
instances, and that the testers know how to generate them automatically?

It seems to me that it would also work just fine for "general education,
as well as characterizing code analysis tool capabilities."
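
As a rough sketch of what "a grammar plus automatic generation" could
look like, the C code below expands a toy grammar (a dot sequence
followed by a separator, repeated depth times) into concrete test
strings. The grammar, the token lists, and the names VARIANT_LEN and
generate_variants() are assumptions for illustration only and do not
claim to cover every CWE path traversal leaf.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VARIANT_LEN 32

/* Expand:  variant := (dots sep){depth}
 * Writes up to max strings into out and returns how many were written. */
size_t generate_variants(char out[][VARIANT_LEN], size_t max, int depth)
{
    static const char *dots[] = { "..", "...", "...." };
    static const char *seps[] = { "/", "\\", "//" };
    size_t count = 0;

    /* Longest unit is "...." + "//" (6 chars), so depth 5 fits in 32 bytes. */
    assert(out != NULL && depth >= 1 && depth <= 5);

    for (size_t d = 0; d < 3; d++) {
        for (size_t s = 0; s < 3; s++) {
            if (count == max)
                return count;
            out[count][0] = '\0';
            for (int i = 0; i < depth; i++) {
                strcat(out[count], dots[d]);
                strcat(out[count], seps[s]);
            }
            count++;
        }
    }
    return count;
}
```

Nine lines of grammar data stand in for nine separate leaf
descriptions here, and adding one more separator or dot sequence grows
the generated set without adding a new node.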


Steven M. Christey wrote:

> I'm changing the subject line because Robert's question touches on some
> systemic CWE issues, and we'd really like the CWE community to give us
> their thoughts.
>
>
> Robert C. Seacord asked:
>
>> I have not been able to use "..." in place of "..\.." on any of my
>> Windows systems.  Where is this an issue?
>
> I believe this applied to older Windows systems, at least NT.  I
> distinctly remember testing it, because I couldn't believe it was true :)
>
>> As a more general comment--have you given any thought to collapsing some
>> of these together?  There seem to be an awful lot of nuanced
>> distinctions.  For example, if you were to introduce the term "separator
>> character" which could be equal to '\' or '/' you could quickly
>> eliminate a number of leaves in this section.
>
> We've done a little bit of thinking on this.  The path traversal nodes are
> one part of the tree that seems excessively deep.
>
> Note that the current path traversal leaves, such as "..." and "....//",
> are all attack-focused.  They concentrate on specific manipulations that
> are nonetheless a little different from each other.
>
> There are a few different ways this could be handled.
>
> One would be to abstract the leaf nodes like so:
>
>   - merge of '/' or '\' as "directory separators".  Issue: this doesn't
>     account for the frequent occurrence where an application accounts for
>     one separator but not the other, especially in Windows systems where
>     both "/" and "\" often work.
>
>   - merge of leading/trailing/internal separators.  Issue: as with / and
>     \, sometimes you'll encounter cases where one works but not the other.
>
>   - merge multiple dots or other doubled sequences.  Issue: well, you get
>     the drift.
>
>
> A second approach is to focus more on the underlying weaknesses, and try
> to infer WHY these manipulations really work.
>
> In that sense, we might have something like:
>
>   - the canonical example - "../"
>
>   - failure to account for syntactic equivalence.  This might address
>     multiple internal "//" and perhaps "..."
>
>   - failure to account for OS-specific variants.  In relative path
>     traversal, this might be "/" and "\" (Unix vs. Windows), as well as
>     "..." and "C:\drive\letter" (Windows only)
>
>   - protection mechanism failure (PMF): an attempt to protect against
>     traversal is incorrect or incomplete.  Ideally, each PMF would have
>     its own CWE as well.  But PMFs get into the whole complicated notion
>     of weakness chains (a la integer-overflow-leads-to-buffer-overflow),
>     and I'm not sure how we could best handle chains within CWE.
>     However, it's a good topic to explore.
>
>     At any rate, for example, "....//" might fall under this category,
>     since it's not valid to the OS as-is, but a collapse-into-unsafe-value
>     error (CWE-182) might reduce this string into "../" which then enables
>     the traversal to take place.
>
> As you can see, there are a few options.  The second approach, while more
> weakness-focused, still has overlapping categories.
>
> ===== So what should happen to the current low-level leaf nodes? =====
>
> If it's decided that these are too low-level, or the wrong perspective,
> then some options for handling them are covered at
> http://cwe.mitre.org/community/research/restructuring.html
>
> But the question still remains about what to do.
>
> Some CWE consumers might advocate that the existing low-level nodes should
> stay as they are, because there are some cases where this precision would
> be useful, such as in a pen-test or black-box scanning situation where you
> would want to ensure broad coverage, *or* if a programmer is implementing
> a protection mechanism and wants to figure out what potential problems
> could arise.  However, these nodes, as currently described, are inherently
> attack-focused.  In addition, you wind up with a risk of combinatorial
> explosion if you try to ensure that CWE's coverage is complete in this
> area.  Path traversal has been a favorite vulnerability of mine for years,
> so I've studied it pretty closely, and I suspect that even the existing
> set of leaf nodes is incomplete.
>
> Others would think that a middle ground might be more appropriate.  This
> makes sense for a few use-cases, including general education, as well as
> characterizing code analysis tool capabilities - they mostly only spot
> "pathname can be manipulated", so all (or most) leaf nodes would apply.
>
> Note that CWE contains a LOT of nodes that are like this.  XSS (CWE-79),
> Buffer Errors (CWE-119), and Information Leaks (CWE-200) are some other
> examples.
>
> ===== The larger questions =====
>
> The path traversal nodes illustrate some larger questions that we face for
> CWE:
>
> 1) When there are competing perspectives, which ones do we adopt?  The
>    use-cases we defined at
>    http://cwe.mitre.org/community/research/stakeholders.html, plus
>    feedback from the CWE research community (i.e. you), will help us to
>    decide this.
>
> 2) If we adopt multiple perspectives, then how do we modify CWE so that it
>    supports these perspectives, both internally within the XML, and
>    externally in terms of presenting them to CWE consumers without
>    confusing them?  Hopefully, views will be able to account for these,
>    but there's also the question of resources/labor and where we direct
>    our efforts (although community involvement might help here, too).
>
> 3) Do we aim for theoretical completeness at a low level for CWE, which
>    could translate into combinatorial explosions and perhaps tens of
>    thousands of CWEs, or do we live with incomplete nodes and the
>    associated biases that they would indirectly introduce into
>    quantitative comparisons?  Are there ways of handling the combinatorics
>    without producing an excessively large set of CWEs?  (Some other MITRE
>    initiatives such as CCE have their own ways of dealing with
>    combinatorics.)
>
>
> You can see some of these questions "in action" on the discussion points
> page on the CWE web site, at
> http://cwe.mitre.org/community/research/content_discussion.html
>
> - Steve

Re: Path Traversal and some challenges for CWE

by Steven M. Christey-2

On Wed, 17 Oct 2007, Pascal Meunier wrote:

> I've wondered why a regular expression or grammar couldn't be used
> sometimes in the CWE, to bring combinatorial explosions under control.
> Isn't it sufficient, at least for the "pen-test or black-box scanning
> situation" that the product fails when given one of the possible input
> instances, and that the testers know how to generate them automatically?
>
> It seems to me that it would also work just fine for "general education,
> as well as characterizing code analysis tool capabilities."

Internally, we've informally discussed "sub-nodes" as one possible
solution.  It's not clear what they would look like, but the basic idea
is: a CWE node only goes to one particular level.  Where greater levels of
details are desired, "sub-nodes" are created that get attached to the
higher-level CWE node.  These sub-nodes would *not* have their own IDs.
They would be well-structured in order to support automated processing.
But, since they would name the specific details, they would still be
useful in the contexts where that's needed.

Consider a resource-specific example, CWE-538 "File and Directory
Information Leaks," where it has several child nodes that only differ
based on which type of file contains the leak: 528 is for core dump files,
527 is for CVS repository, 532 is for log files, and 530 is for backup
files.  These are all about the same general resource - files - but only
vary depending on the *type* of file resource.

There are two things to consider here:

1) The list of children isn't complete.  For example, there's CVS but
   not SVN or other source control products; there's source code but not
   configuration file or data file;  there's backup files, but only ~.bk
   is mentioned - how about "~" (Emacs anyone?), .1/.2, .bak, etc.

2) Path traversal is about files.  So, if we decide that we're going to
   split information leak based on different types of files, then why not
   split each path traversal node based on different types of files?
   That would be 12*9 separate nodes (number of children of CWE-23
   times number of children of CWE-538), a total of 108 nodes, just for
   those two.
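
To make the arithmetic in point 2 concrete, here is a small Python sketch.
Only the counts (12 and 9) come from the discussion above; the labels are
placeholders, not real CWE node names.

```python
from itertools import product

# Placeholder labels: only the counts (12 and 9) come from the discussion.
traversal_variants = [f"traversal-variant-{i}" for i in range(1, 13)]
file_types = [f"file-type-{i}" for i in range(1, 10)]

# One node per (variant, file type) pair if both dimensions are split out.
crossed_nodes = list(product(traversal_variants, file_types))

print(len(crossed_nodes))                         # 108 nodes when crossed
print(len(traversal_variants) + len(file_types))  # 21 entries as two lists
```

Keeping the two dimensions as separate lists grows additively rather than
multiplicatively, which is the appeal of the approach described next.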

Now, if 538 were restructured to use "sub-nodes," then it would be at a
fixed level of abstraction, and the sub-node would list the different
types of files to consider.  This same list of file-types could be reused
in other CWE nodes that are specific to file resources.
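
As a very rough illustration of what that reuse might look like, here is a
Python sketch.  The CweNode class, the sub_nodes field, and the FILE_TYPES
list are all hypothetical names made up for this example; nothing like this
exists in the CWE XML today.

```python
from dataclasses import dataclass, field

# Hypothetical shared list of file types, taken from the examples above.
FILE_TYPES = ["core dump", "CVS repository", "log", "backup"]

@dataclass
class CweNode:
    cwe_id: int
    name: str
    # Sub-nodes carry the detail but deliberately have no CWE ID of their own.
    sub_nodes: dict = field(default_factory=dict)

info_leak = CweNode(538, "File and Directory Information Leaks",
                    {"file type": FILE_TYPES})
rel_traversal = CweNode(23, "Relative Path Traversal",
                        {"file type": FILE_TYPES})

# Adding a missing file type once is visible from every node that uses the list.
FILE_TYPES.append("SVN repository")
print(rel_traversal.sub_nodes["file type"])
```

Both nodes stay at a fixed level of abstraction, and filling a gap in the
list does not require a new CWE ID.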

- Steve

P.S.  For some other resource-specific examples in CWE, see
http://cwe.mitre.org/community/research/discussion/resspec.html

Re: Path Traversal and some challenges for CWE

by Dave McKinney

On Wed, Oct 17, 2007 at 01:44:39PM -0400, Steven M. Christey wrote:

> > As a more general comment--have you given any thought to collapsing some
> > of these together?  There seem to be an awful lot of nuanced
> > distinctions.  For example, if you were to introduce the term "separator
> > character" which could be equal to '\' or '/' you could quickly
> > eliminate a number of leaves in this section.
>
> We've done a little bit of thinking on this.  The path traversal nodes are
> one part of the tree that seems excessively deep.
>
> Note that the current path traversal leaves, such as "..." and "....//",
> are all attack-focused.  They concentrate on specific manipulations that
> are nonetheless a little different from each other.

Supporting the attack-focused variants as individual nodes is where
the problem lies, I think. I don't see this very often in other parts
of the CWE, where it is often adequate to generalize the issue and then
split hairs at a higher level than what is being done with 'Absolute'
and 'Relative' Path Traversal.

These variants exist either because the OS/filesystem/shell supports
different permutations of path traversal syntax or possibly because
there have been common instances in the wild where an application's
input validation routines have opened the door for new path traversal
variants (in an honest effort to prevent simpler variants). Couldn't
the common cases be described in the context notes without meriting
their own nodes?

Do the variants represent distinct manifestations of a larger problem
or are they just more ways to skin the same cat? I understand there is
a need for both pen-testers and application developers to be aware of
these variants but to me it seems inconsistent with how other CWE
entries have been generalized. For example, if you look at OS Command
Injection as a somewhat similar problem, it does not have child nodes
that reflect specific metacharacters, i.e., a node for ;, a node for |,
etc. XSS also has similar concerns and the potential for variants is
far worse if CWE opens the floodgates... how about a node for each
example in RSnake's XSS Cheat Sheet? Nodes like "Script in IMG Tags"
already worry me. :(

I really think attacker input is too dynamic to really classify in
this way. Even with path traversals there are a lot of permutations
that are valid syntax and many that exploit shoddy input validation
routines. It would be hard to ensure that the child nodes provided
complete coverage of the variants. While applications should be
liberal in what they accept, I don't think the CWE should be. ;)

Pen-testers and application developers would probably be just as well
served by having some common examples in the Absolute/Relative Path Traversal
parent nodes via references and context notes.

It also depends on how you frame the weakness. Is the failure to
filter '../' a different weakness than failing to filter '....//'?
It may be, depending on the implementation.
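
To illustrate how it can depend on the implementation, here is a minimal
Python sketch of a filter that strips "../" in a single pass.  The function
name naive_strip is made up for this example and is not taken from any real
product.

```python
def naive_strip(path: str) -> str:
    # Single, non-recursive pass: each "../" found is removed exactly once.
    return path.replace("../", "")

print(naive_strip("../etc/passwd"))     # etc/passwd
print(naive_strip("....//etc/passwd"))  # ../etc/passwd
```

The filter handles the canonical "../" but turns "....//" into a working
traversal, so for this particular implementation the two inputs really do
exercise different failures.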

However, many implementations may be open to all of the variants due
to a complete lack of input validation of user-supplied paths/files --
and I'd personally argue against classifying that as X number of
weaknesses depending on the number of variants in the CWE dictionary.
I see it as one fundamental problem that could potentially be
addressed without specifically filtering every variant.
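
As a sketch of what "without specifically filtering every variant" could
mean, the following Python resolves the requested path first and then checks
that it still sits under the intended base directory.  The function name
is_within is hypothetical, and this assumes a POSIX-style filesystem (on
Windows, os.path.commonpath raises ValueError for paths on different drives,
which a real implementation would need to handle).

```python
import os

def is_within(base_dir: str, user_path: str) -> bool:
    # Resolve "..", repeated separators, and symlinks before comparing.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # The resolved target must still sit under the resolved base directory.
    return os.path.commonpath([base, target]) == base

for candidate in ["docs/a.txt", "../etc/passwd", "/etc/passwd"]:
    print(candidate, is_within("/srv/files", candidate))
```

The check never looks for particular attack strings, so it does not need
updating when a new syntactic variant turns up.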



--
Dave McKinney
Symantec

keyID: E461AE4E
key fingerprint = F1FC 9073 09FA F0C7 500D  D7EB E985 FAF3 E461 AE4E

Re: Path Traversal and some challenges for CWE

by Pascal Meunier-3

I don't understand how you would use the sub-nodes, or how they could be
represented.  The closest concepts I can think of are an additional
table in a relational database for resources with rows like:

| resource type | name        |
| "file type"   | "core dump" |
| "file type"   | "backup"    |
| "file type"   | "log"       |

or an associative array:
"file type" => ["core dump", "backup", "log"]
"device" => ["tty0"]
etc...

In which case several CWE nodes would have a field that would point to
"file type"?  And by editing file types once you would be automatically
updating all the CWE entries that refer to those file types.  You could
even have several different lists of file types for different uses.
This would in effect "normalize" the CWE, to use database jargon.
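
A quick Python sketch of that normalization, under the assumption that each
CWE entry stores only the key of the list it uses.  RESOURCE_LISTS,
CWE_RESOURCE_FIELD, and resources_for are hypothetical names for this
example, and the mapping of CWE-538 and CWE-23 to "file type" is
illustrative only.

```python
# Shared lists keyed by resource type (values copied from the example above).
RESOURCE_LISTS = {
    "file type": ["core dump", "backup", "log"],
    "device": ["tty0"],
}

# Hypothetical entries: each CWE node stores only the key of the list it uses.
CWE_RESOURCE_FIELD = {538: "file type", 23: "file type"}

def resources_for(cwe_id: int) -> list:
    # Resolve the key at lookup time so one edit reaches every entry.
    return RESOURCE_LISTS[CWE_RESOURCE_FIELD[cwe_id]]

# A missing CWE ID becomes a missing list element, fixed in one place.
RESOURCE_LISTS["file type"].append("configuration")
print(resources_for(538))
print(resources_for(23))
```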

That sounds all good and well if a little complex.  I think it's
overkill compared to just adding a field that would support syntax
testing where appropriate.
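
For what it's worth, here is the kind of thing I have in mind, as a Python
sketch.  The building blocks are just the strings mentioned in this thread,
and the function name traversal_variants is made up; a real field would
need a properly specified grammar.

```python
from itertools import product

# Hypothetical building blocks, limited to strings discussed in this thread.
DOT_SEQUENCES = ["..", "...", "...."]
SEPARATORS = ["/", "\\", "//"]

def traversal_variants(depth: int = 1):
    # Yield every dot-sequence/separator pairing, repeated `depth` times.
    for dots, sep in product(DOT_SEQUENCES, SEPARATORS):
        yield (dots + sep) * depth

variants = list(traversal_variants())
print(len(variants))  # 9
print(variants)
```

Two short lists stand in for nine separate leaf nodes, and a tester can
generate the inputs automatically from them.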

Nevertheless, I like the idea of avoiding a large number of CWE IDs that
vary on a single, simple detail that could be easily abstracted and
stored in a list.  The problem of missing CWE IDs is then reduced to
that of a missing element in the list, which I would think is much less
problematic for someone trying to identify a specific CWE ID relating to
an issue at hand.

Pascal



RE: Path Traversal and some challenges for CWE

by Jarzombek, Joe

Steve & Bob -

I, too, like the idea of avoiding a large number of CWE IDs that vary on
a single, simple detail that could be easily abstracted and stored.

Let's discuss the desired solution/selected implementation at our
upcoming Software Assurance Working Group session on Technology, Tools
and Product Evaluation, 4 Dec in Arlington, VA.  Since CWE provides the
foundation for our SwA Ecosystem, we would also provide a short
out-brief during the plenary session on 5 Dec.
 
V/r,

Joe
 
Joe Jarzombek, PMP
Director for Software Assurance
National Cyber Security Division
Office of Assistant Secretary
   for Cyber Security & Communications
Department of Homeland Security

e-mail:  Joe.Jarzombek@...
Cell Phone:  703 627-4644
Business Phone:  703 235-5126
Fax:  703 235-5961
http://www.us-cert.gov/swa/  "Build Security In"
https://buildsecurityin.us-cert.gov 
 

-----Original Message-----
From: owner-cwe-research-list@...
[mailto:owner-cwe-research-list@...] On Behalf Of Pascal
Meunier
Sent: Thursday, October 18, 2007 10:31 AM
To: Steven M. Christey
Cc: CWE-RESEARCH-LIST@...
Subject: Re: Path Traversal and some challenges for CWE

I don't understand how you would use the sub-nodes, or how they could be
represented.  The closest concepts I can think of are an additional
table in a relational database for resources with rows like:

| resource type | name        |
| "file type"   | "core dump" |
| "file type"   | "backup"    |
| "file type"   | "log"       |

or an associative array:
"file type" => ["core dump", "backup", "log"]
"device" => ["tty0"]
etc...

In which case several CWE nodes would have a field that would point to
"file type"?  And by editing file types once you would be automatically
updating all the CWE entries that refer to those file types.  You could
even have several different lists of file types for different uses.
This would in effect "normalize" the CWE, to use database jargon.

That sounds all good and well if a little complex.  I think it's
overkill compared to just adding a field that would support syntax
testing where appropriate.

Nevertheless, I like the idea of avoiding a large number of CWE IDs that
vary on a single, simple detail that could be easily abstracted and
stored in a list.  The problem of missing CWE IDs is then reduced to
that of a missing element in the list, which I would think is much less
problematic for someone trying to identify a specific CWE ID relating to
an issue at hand.

Pascal


Steven M. Christey wrote:
> On Wed, 17 Oct 2007, Pascal Meunier wrote:
>
>> I've wondered why a regular expression or grammar couldn't be used
>> sometimes in the CWE, to bring combinatorial explosions under control.
>> Isn't it sufficient, at least for the "pen-test or black-box scanning
>> situation" that the product fails when given one of the possible input
>> instances, and that the testers know how to generate them automatically?
>>
>> It seems to me that it would also work just fine for "general education,
>> as well as characterizing code analysis tool capabilities."
>
> Internally, we've informally discussed "sub-nodes" as one possible
> solution.  It's not clear what they would look like, but the basic idea
> is: a CWE node only goes to one particular level.  Where greater levels of
> details are desired, "sub-nodes" are created that get attached to the
> higher-level CWE node.  These sub-nodes would *not* have their own IDs.
> They would be well-structured in order to support automated processing.
> But, since they would name the specific details, they would still be
> useful in the contexts where that's needed.
>
> Consider a resource-specific example, CWE-538 "File and Directory
> Information Leaks," where it has