Abstract and Concrete CPE Names in the Dictionary

Valeri, David [USA]

Abstract and Concrete CPE Names in the Dictionary

While I know that the use of Microsoft Windows as an example may drag some
baggage into this discussion, please focus on the questions/issues I am
raising instead of the use of Microsoft Windows as an example.  I chose it
purely because it is a product suite that I know well and that exhibits the
behavior in question within the dictionary.

In the current dictionary, a cpe-item is defined for the following Windows
XP variants as well as for SP2 variants.

cpe:/o:microsoft:windows_xp
cpe:/o:microsoft:windows_xp::gold
cpe:/o:microsoft:windows_xp::gold:embedded
cpe:/o:microsoft:windows_xp::gold:media_center
cpe:/o:microsoft:windows_xp::gold:professional
cpe:/o:microsoft:windows_xp::gold:tablet_pc
cpe:/o:microsoft:windows_xp::sp1:embedded
cpe:/o:microsoft:windows_xp::sp1:media_center
cpe:/o:microsoft:windows_xp::sp1:professional
cpe:/o:microsoft:windows_xp::sp1:tablet_pc

It seems to me that cpe:/o:microsoft:windows_xp and
cpe:/o:microsoft:windows_xp::gold are not tangible products that I can
install onto an asset.  The other variants can be installed onto an asset.
For this discussion I will refer to the first two CPE Names in the above
list as abstract and the remaining CPE Names as concrete.

I am under the impression that the abstract CPE Names are used to represent
hierarchical metadata such as OVAL definitions that apply to the other CPE
Names below it (I am envisioning a tree with the root node of
cpe:/o:microsoft:windows_xp, an internal node of
cpe:/o:microsoft:windows_xp::gold and leaf nodes of the concrete CPE Names
in the list above).

Now to the questions/issues:

1) Is my hierarchical interpretation correct?  The specification mentions
hierarchical on page 19, but doesn't discuss hierarchy in the dictionary.
("Matching helps define the relationship between different CPE Names (or
language statements) and follows the hierarchical relationship built into
the naming format.")

1.a) If 1 is correct, is there a requirement that a check associated to
cpe:/o:microsoft:windows_xp be applicable to the concrete CPE Names that
build upon it?

1.b) Are these checks required to be future-proof?  That is, if a new
edition, version, language, etc. of cpe:/o:microsoft:windows_xp is released,
are the checks associated to cpe:/o:microsoft:windows_xp required to detect
this new variant?  If not, are the checks updated or is the metadata in the
dictionary updated to remove this out-of-date reference to the check?

1.c) Are there other use cases that require abstract CPE names to be in the
dictionary?

2) BAH supports clients that are leveraging the CPE specification in order
to represent configuration information about computing assets.  In this use
case, only concrete CPE Names are of value as I cannot have an asset with
cpe:/o:microsoft:windows_xp::gold installed.  I can only have an asset with
a tangible product installed and these are represented by concrete CPE Names
only.  How can I parse the CPE dictionary to only extract concrete names?  I
think I can do it by parsing the list of cpe-items (with great difficulty).
I think I can do it by parsing NIST's metadata (with less difficulty), but,
I think that I only want the leaf nodes in the hierarchy.

2.a) If I do it by parsing for leaf nodes, is there ever the case where an
internal node represents a concrete CPE name?  For instance, assume that
Microsoft released an operating system with a single edition only, we will
call it cpe:/o:microsoft:windows_example::gold (again, ignore the use of a
Microsoft OS here and focus on the potential issue).  Now say shortly
thereafter a large group of countries decides that Microsoft needs to
decouple some parts of the OS so Microsoft releases
cpe:/o:microsoft:windows_example::gold:eu to comply with the ruling (don't
focus on cpe:/o:microsoft:windows_example::gold:eu being the right CPE Name
to represent this condition, focus on the fact that a new and unanticipated
variant has been released).  At this point, I see a potential problem,
cpe:/o:microsoft:windows_example::gold was originally the only variant of
the OS, but now cpe:/o:microsoft:windows_example::gold:eu could exist as
well.  At this point, cpe:/o:microsoft:windows_example::gold would need to
be deprecated, as a CPE logical expression matching
cpe:/o:microsoft:windows_example::gold would also match
cpe:/o:microsoft:windows_example::gold:eu.
cpe:/o:microsoft:windows_example::gold can no longer be uniquely identified
because of this new variant.  Has a situation such as this one occurred in
the dictionary and is a process in place to avoid such a situation?  While I
don't see this type of issue arising often with well-known or mature pieces
of software, I do see it occurring with the smaller and more rapidly
changing items in the dictionary.

3) Is Windows XP Home edition missing from the dictionary?  I just want to
make sure that cpe:/o:microsoft:windows_xp::gold is representative of an
abstract CPE Name and not of Windows XP Home.  If it does represent Windows
XP Home, then issue 2.a has already occurred.
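
A note on question 2: the leaf-node idea can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not an official
dictionary tool. It assumes the names are plain CPE 2.x URIs and treats a name
as an internal node when its components are a proper prefix of another name's
components. The helper names (components, is_proper_prefix, leaf_names) are
made up for this example.

```python
# Names copied from the Windows XP list earlier in this message.
NAMES = [
    "cpe:/o:microsoft:windows_xp",
    "cpe:/o:microsoft:windows_xp::gold",
    "cpe:/o:microsoft:windows_xp::gold:embedded",
    "cpe:/o:microsoft:windows_xp::gold:media_center",
    "cpe:/o:microsoft:windows_xp::gold:professional",
    "cpe:/o:microsoft:windows_xp::gold:tablet_pc",
    "cpe:/o:microsoft:windows_xp::sp1:embedded",
    "cpe:/o:microsoft:windows_xp::sp1:media_center",
    "cpe:/o:microsoft:windows_xp::sp1:professional",
    "cpe:/o:microsoft:windows_xp::sp1:tablet_pc",
]


def components(name):
    """Split a CPE 2.x URI into components, dropping trailing empty ones."""
    if not name.startswith("cpe:/"):
        raise ValueError("not a CPE 2.x URI: " + name)
    parts = name[len("cpe:/"):].split(":")
    while parts and parts[-1] == "":
        parts.pop()
    return parts


def is_proper_prefix(shorter, longer):
    """True if the components of `shorter` are a proper prefix of `longer`."""
    a, b = components(shorter), components(longer)
    return len(a) < len(b) and b[:len(a)] == a


def leaf_names(names):
    """Keep only names that are not a proper prefix of any other name."""
    return [n for n in names
            if not any(is_proper_prefix(n, other) for other in names)]


leaves = leaf_names(NAMES)
```

For the ten names above this keeps the eight edition-level names and drops
cpe:/o:microsoft:windows_xp and cpe:/o:microsoft:windows_xp::gold. It does not
resolve question 2.a: a name that is a leaf today stops being one as soon as a
more specific name is added to the list.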


David Valeri
Booz Allen Hamilton
8281 Greensboro Dr.
McLean, VA 22102
Tel: 703.377.5607
Fax: 703.902.3330



Andrew Buttner

Re: Abstract and Concrete CPE Names in the Dictionary

>cpe:/o:microsoft:windows_xp

>cpe:/o:microsoft:windows_xp::gold
>cpe:/o:microsoft:windows_xp::gold:embedded
>cpe:/o:microsoft:windows_xp::gold:media_center
>cpe:/o:microsoft:windows_xp::gold:professional
>cpe:/o:microsoft:windows_xp::gold:tablet_pc
>cpe:/o:microsoft:windows_xp::sp1:embedded
>cpe:/o:microsoft:windows_xp::sp1:media_center
>cpe:/o:microsoft:windows_xp::sp1:professional
>cpe:/o:microsoft:windows_xp::sp1:tablet_pc
>
>It seems to me that cpe:/o:microsoft:windows_xp and
>cpe:/o:microsoft:windows_xp::gold are not tangible products that I can
>install onto an asset.  The other variants can be installed onto an
>asset.
I will try to clear up this confusion as best I can.  Please let me know if
I fail miserably :)

None of the names above represent a tangible product.  In fact a CPE is not
meant to accomplish this.  Rather, a CPE Name represents a "platform type".
For the first name in your example, this platform type would be - any system
that has Windows XP installed.  For the last example this would be - any
system that has Windows XP SP1 tablet edition installed.  Note that even
this last one doesn't identify a tangible product as there are different
language releases, etc.

In summary, a CPE Name identifies a Platform Type, not a specific platform.



>1) Is my hierarchical interpretation correct?  The specification
>mentions
>hierarchical on page 19, but doesn't discuss hierarchy in the
>dictionary.
>("Matching helps define the relationship between different CPE Names (or
>language statements) and follows the hierarchical relationship built
>into
>the naming format.")
>
>1.a) If 1 is correct, is there a requirement that a check associated to
>cpe:/o:microsoft:windows_xp be applicable to the concrete CPE Names that
>build upon it?
Note that when dealing with CPE, I understand a "check" to be a test used to
determine if a given system can be grouped under a specific CPE Name.  In
this case, a system that returns true for the cpe:/o:microsoft:windows_xp
check does not necessarily return true for the
cpe:/o:microsoft:windows_xp::sp1 check.

But a system that returns true for cpe:/o:microsoft:windows_xp::sp1 would
return true for cpe:/o:microsoft:windows_xp.
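
The one-way relationship described above can be illustrated with a small sketch
of prefix-style name matching. It assumes a simplified reading of CPE 2.x
matching, in which a name matches a more specific name when each of its
components is either empty or equal to the corresponding component. The function
names are hypothetical and this is not the reference algorithm from the
specification.

```python
def components(name):
    """Split a CPE 2.x URI such as cpe:/o:vendor:product into components."""
    if not name.startswith("cpe:/"):
        raise ValueError("not a CPE 2.x URI: " + name)
    return name[len("cpe:/"):].split(":")


def name_matches(pattern, known):
    """True if `known` falls under the platform type named by `pattern`.

    An empty component in `pattern` is treated as "any value" (assumption).
    """
    p, k = components(pattern), components(known)
    while p and p[-1] == "":      # trailing empty components add no constraint
        p.pop()
    if len(k) < len(p):           # `known` is less specific than `pattern`
        return False
    return all(pc == "" or pc == kc for pc, kc in zip(p, k))


xp = "cpe:/o:microsoft:windows_xp"
xp_sp1 = "cpe:/o:microsoft:windows_xp::sp1"

# A system identified as XP SP1 also belongs to the XP platform type ...
sp1_is_xp = name_matches(xp, xp_sp1)
# ... but knowing only "XP" does not establish "XP SP1".
xp_is_sp1 = name_matches(xp_sp1, xp)
```

Under these assumptions sp1_is_xp is True and xp_is_sp1 is False, which mirrors
the behaviour of the two checks described above.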



>1.b) Are these checks required to be future-proof?  That is, if a new
>edition, version, language, etc. of cpe:/o:microsoft:windows_xp is
>released, are the checks associated to cpe:/o:microsoft:windows_xp
>required to detect this new variant?

Ideally yes, although in reality a check may have to be updated.  Basically,
we are looking for a check that answers the question: "is this system part
of the Windows XP platform type?"  This cannot always be done in a
future-proof way.




>If not, are the checks updated or is the metadata in the
>dictionary updated to remove this out-of-date reference to the check?

The check should be updated.





>2) BAH supports clients that are leveraging the CPE specification in
>order to represent configuration information about computing assets.
>In this use case, only concrete CPE Names are of value as I cannot
>have an asset with cpe:/o:microsoft:windows_xp::gold installed.  I
>can only have an asset with a tangible product installed and these
>are represented by concrete CPE Names only.  How can I parse the CPE
>dictionary to only extract concrete names?  I think I can do it by
>parsing the list of cpe-items (with great difficulty). I think I can
>do it by parsing NIST's metadata (with less difficulty), but, I
>think that I only want the leaf nodes in the hierarchy.
I think what you really mean here is that you are only interested in CPE
Names that use certain components.  A simple regular expression or XPath
statement might be able to do the trick here.
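
As a rough illustration of the regular expression approach, the sketch below
keeps only names whose edition component is filled in. It assumes the CPE 2.x
component order part:vendor:product:version:update:edition:language. The pattern
and the has_edition helper are illustrative only and are not part of CPE.

```python
import re

# part : vendor : product : version : update : edition [: language]
# version and update may be empty; edition must be non-empty.
EDITION_PATTERN = re.compile(
    r"^cpe:/[aho]:[^:]+:[^:]+:[^:]*:[^:]*:[^:]+(?::[^:]*)?$"
)


def has_edition(name):
    """True if the CPE name carries a non-empty edition component."""
    return EDITION_PATTERN.match(name) is not None


names = [
    "cpe:/o:microsoft:windows_xp",
    "cpe:/o:microsoft:windows_xp::gold",
    "cpe:/o:microsoft:windows_xp::gold:professional",
    "cpe:/o:microsoft:windows_xp::sp1:tablet_pc",
]

# Keep only the names that go down to the edition level.
edition_level = [n for n in names if has_edition(n)]
```

Which components to require is a choice for the consumer of the dictionary; the
same pattern could be tightened to require a language as well.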



>2.a) If I do it by parsing for leaf nodes, is there ever the case where
>an internal node represents a concrete CPE name?  For instance, assume
>that Microsoft released an operating system with a single edition only,
>we will call it cpe:/o:microsoft:windows_example::gold (again, ignore
>the use of a Microsoft OS here and focus on the potential issue).  Now
>say shortly thereafter a large group of countries decides that
>Microsoft needs to decouple some parts of the OS so Microsoft releases
>cpe:/o:microsoft:windows_example::gold:eu to comply with the ruling
>(don't focus on cpe:/o:microsoft:windows_example::gold:eu being the
>right CPE Name to represent this condition, focus on the fact that a
>new and unanticipated variant has been released).  At this point, I see
>a potential problem, cpe:/o:microsoft:windows_example::gold was
>originally the only variant of the OS, but now
>cpe:/o:microsoft:windows_example::gold:eu could exist as well.  At this
>point, cpe:/o:microsoft:windows_example::gold would need to
>be deprecated, as a CPE logical expression matching
>cpe:/o:microsoft:windows_example::gold would also match
>cpe:/o:microsoft:windows_example::gold:eu.
>cpe:/o:microsoft:windows_example::gold can no longer be uniquely
>identified because of this new variant.  Has a situation such as this
>one occurred in the dictionary and is a process in place to avoid such
>a situation? While I don't see this type of issue arising often with
>well-known or mature pieces of software, I do see it occurring with the
>smaller and more rapidly changing items in the dictionary.
I think this is a good example of why we can't look at any CPE Name as a
concrete name, but rather we need to look at them as representing platform
types.  In your example, after the decoupling,
cpe:/o:microsoft:windows_example::gold would represent the platform type
"any platform with Windows Example Gold installed".  This would include any
specific edition including 'eu'.  The name
cpe:/o:microsoft:windows_example::gold:eu would refer to the platform type
"any platform with Windows Example Gold EU edition installed".

No change would be needed for the existing CPE Names to work with the new
product structure.



>3) Is Windows XP Home edition missing from the dictionary?  I just want
>to make sure that cpe:/o:microsoft:windows_xp::gold is representative of an
>abstract CPE Name and not of Windows XP Home.  If it does represent
>Windows XP Home, then issue 2.a has already occurred.

My guess is it is missing.  cpe:/o:microsoft:windows_xp::gold represents any
Windows XP Gold platform, including Home, Professional, Media Center, etc.



Again, I hope this helped.  Please let us know if there are further
questions.

Thanks
Drew




Harold Booth-2

Re: Abstract and Concrete CPE Names in the Dictionary

Drew,

   I am hoping to get a clarification on the following statements:

Quoting "Buttner, Drew" <[hidden email]>:

> None of the names above represent a tangible product.  In fact a CPE is not
> meant to accomplish this.  Rather, a CPE Name represents a "platform type".
> For the first name in your example, this platform type would be - any system
> that has Windows XP installed.  For the last example this would be - any
> system that has Windows XP SP1 tablet edition installed.  Note that even
> this last one doesn't identify a tangible product as there are different
> language releases, etc.
>
> In summary, a CPE Name identifies a Platform Type, not a specific platform.
>

In CPE how would you distinguish between a product reference and a platform
type?  A product reference would be a leaf node in the CPE tree hierarchy.  How
would you expect an asset database to use CPE where it needs to refer to
specific products?  Or how would two asset databases communicate with each other
using CPE when they need to refer to specific products? (Which I believe to be a
paraphrase of a stated use case.)
Andrew Buttner

Re: Abstract and Concrete CPE Names in the Dictionary

>In CPE how would you distinguish between a product reference and a
>platform type?

CPE doesn't try to be a specific product reference.  When creating the
list of components, we did not try to create a full list of stuff
needed to uniquely id specific products.  Rather we tried to establish
a list of components that are relatively common across different
platforms and would help us create unique identifiers for the level of
specificity desired.  In this case we decided to go down to the
language level.


>A product reference would be a leaf node in the CPE
>tree hierarchy.  How would you expect an asset database to use CPE
>where it needs to refer to specific products?

My guess is that when an asset database needs to refer to a product,
that it means product type the way CPE thinks about it.  In other
words, it needs to know how many systems have some version of Windows
XP, or how many systems have Windows XP SP1, or how many systems have
Windows XP SP1 Professional English.  All of these are platform types.


>Or how would two
>asset databases communicate with each other using CPE when they
>need to refer to specific products? (Which I believe to be a
>paraphrase of a stated use case.)

I think we are using "product" and "product type" the same way here.  The
thing to make sure of is that we aren't confusing the terms "system
identification" and "product type".


Thanks
Drew
Valeri, David [USA]

Re: Abstract and Concrete CPE Names in the Dictionary

Drew,

In your last response you said:

"My guess is that when an asset database needs to refer to a product, that
it means product type the way CPE thinks about it.  In other words, it needs
to know how many systems have some version of Windows XP, or how many
systems have Windows XP SP1, or how many systems have Windows XP SP1
Professional English.  All of these are platform types."

These are all valid use cases for an analyst looking into the repository.
Our analysts and systems very much wish to look into the repository and find
assets that have specific traits such as software, OS, and hardware
configurations.  The CPE Logical Expression gives these analysts and systems
the language that they need to construct these queries.  However, the
maintainers of the asset repositories are looking at CPE from the other side
of the system.  They want to assign software, operating systems, and
hardware to an asset.  That is, assemble a collection of CPE names that
represent, exactly, what comprises each asset in their repository.  These
maintainers include government employees and contractors hand-jamming
information into the repository and commercial vendors supporting automated
scanning tools that report system configurations to the asset repository.
These stakeholders, in this scenario, are looking for the authoritative list
of software, hardware, and operating systems that could actually be
installed on their assets, not for a broad family of products.  For example,
knowing that Windows XP is installed on an asset does not tell an analyst if
the asset is susceptible to a vulnerability that affects Windows XP SP2
Professional only.  Similarly, the presence of the issue that I described in
2.a of my original email also precludes the realization of these
stakeholders' use cases.

The use cases in sections 2.1 and 2.3 (in conjunction with the Logical
Expression) of the specification describe part of the use case our clients
are trying to realize; however, it seems unlikely that their use case can be
realized to the desired level of accuracy without unique identifiers for the
software, hardware, and operating systems that could actually be installed
on their assets.  In the end, that brings me back to my original need:  a
definitive list of unique IDs for software, hardware, and operating systems
that can actually be installed on or comprise the contents of an asset.


David Valeri


-----Original Message-----
From: Buttner, Drew [mailto:[hidden email]]
Sent: Monday, June 09, 2008 1:03 PM
To: [hidden email]
Subject: Re: [CPE-DISCUSSION-LIST] Abstract and Concrete CPE Names in the
Dictionary



Andrew Buttner

Re: Abstract and Concrete CPE Names in the Dictionary

>However, the maintainers of the asset repositories are looking at
>CPE from the other side of the system.  They want to assign software,
>operating systems, and hardware to an asset.  That is, assemble a
>collection of CPE names that represent, exactly, what comprises each
>asset in their repository.

CPE should work great for this.  Each CPE Name that is assigned to an asset
identifies a platform type that the asset belongs to.  For example, if they
want to tag an asset as having an OS related to Windows XP then they can use
the CPE Name cpe:/o:microsoft:windows_xp.  If they want to express that an
asset has an OS related to Windows XP SP1 Embedded Edition then they can use
the CPE Name cpe:/o:microsoft:windows_xp::sp1:embedded.

If they are looking for a way to express exact system details, then they
should look at a language like OVAL System Characteristics.  CPE is not
designed to express these details.



>These stakeholders, in this scenario, are looking for the authoritative
>list of software, hardware, and operating systems that could actually
>be installed on their assets, not for a broad family of products.

CPE fits into this by providing an identifier for the platform types once
the list has been created.  It does not try to encode the information
necessary to answer the question about whether an application or OS can be
installed on a system.  For that you should look at a language like OVAL
Definitions.



>In the end, that brings me back to my original need:
>a definitive list of unique IDs for software, hardware, and operating
>systems that can actually be installed on or comprise the contents of
>an asset.

CPE is focused on making sure those unique IDs exist.  Unfortunately, CPE
does not try to determine what the list of IDs should be.  To solve your
need, someone must create a mapping that relates different CPE Names based
on applicability and ability to install.  This mapping might look like:

Host OS                          Application
---------------------------------------------------
cpe:/o:microsoft:windows_xp      cpe:/a:vendor:app1
cpe:/o:microsoft:windows_xp      cpe:/a:vendor:app3
cpe:/o:microsoft:windows_xp::sp1 cpe:/a:vendor:app2
cpe:/o:microsoft:windows_2003    cpe:/a:vendor:app3


Application                      Host OS
---------------------------------------------------
cpe:/a:vendor:app1               cpe:/o:microsoft:windows_xp
cpe:/a:vendor:app1               cpe:/o:microsoft:windows_2003


Does this help answer your question?

Thanks
Drew


Valeri, David [USA]

Re: Abstract and Concrete CPE Names in the Dictionary

Drew,

Thanks for the reply.  I think the language I chose may have misrepresented
my point.  I'll try to clarify below.

The stakeholders in the scenarios I gave do not want a list of which
software can be installed on which operating systems and which operating
systems can be installed on which types of hardware, nor are they
immediately interested in the level of detail described in the OVAL System
Characteristics schema.  Currently, they are interested in the exact details
of installed software, operating systems, and to a lesser degree the
hardware type that the software and operating systems are installed on.

In your reply, you stated:
"For example, if they want to tag an asset as having an OS related to
Windows XP then they can use the CPE Name cpe:/o:microsoft:windows_xp.  If
they want to express that an asset has an OS related to Windows XP SP1
Embedded Edition then they can use the CPE Name
cpe:/o:microsoft:windows_xp::sp1:embedded"

The example you provide is exactly what the stakeholders wish to do;
however, they do not want to express a "related to" relationship.  They wish
to express a definitive "has this" or "is this" relationship to the finest
degree possible in CPE.  To the language level is preferred; however, I
believe to the edition level is sufficient for most use cases.  I am
doubtful that there are a large number of bugs related to language packs,
but I may be wrong.

To represent a "has this" or "is this" relationship to the desired level of
specificity, a CPE assigned to an asset must be unique, of the finest
granularity possible, and must not be susceptible to the scenario I laid out
in question 2.a of my original email.

Currently, my stakeholders have a need to extract the entries from the
dictionary that represent products or hardware that an asset can have or be.
There is no need from my stakeholders to extract CPEs that represent
"relates to" or "in the family of" information.  Furthermore, my
stakeholders have a need for a logical language that can be used to
construct queries to identify assets that have certain configurations.
These queries may be very general and leverage the wildcard features of the
matching algorithm.  These queries may also be very specific and target a
single language or edition of a product.  In the latter case, the matching
algorithm and library need to be able to return only assets that have the
exact product installed (which is where my concern for issue 2.a comes in).

CPE seems to come very close to offering these capabilities; however, I'm
concerned that these use cases are not supported in the current dictionary
data or governance process.

I hope this email clears up any confusion and can set the stage for a more
productive discussion.


David Valeri


-----Original Message-----
From: Buttner, Drew [mailto:[hidden email]]
Sent: Tuesday, June 10, 2008 11:01 AM
To: [hidden email]
Subject: Re: [CPE-DISCUSSION-LIST] Abstract and Concrete CPE Names in the
Dictionary



Andrew Buttner

Re: Abstract and Concrete CPE Names in the Dictionary

>They wish to express a definitive "has this" or "is this" relationship
>to the finest degree possible in CPE.  To the language level is
>preferred; however, I believe to the edition level is sufficient for
>most use cases.

This relationship is something that they need to define anyway (CPE doesn't
try to define it) so it is perfectly acceptable for them to say that the CPE
Name that has been assigned to the asset means "the asset is of this
platform type".



>To represent a "has this" or "is this" relationship to the desired level
>of specificity, a CPE assigned to an asset must be unique,

All CPE Names must be unique by definition.



>of the finest granularity possible

I would reword this to say "must be at the granularity they desire".  You
even stated that going down to language is not desired.



>and must not be susceptible to the scenario I laid
>out in question 2.a of my original email.

Agreed and I think this is covered.



>Currently, my stakeholders have a need to extract the entries from the
>dictionary that represent products or hardware that an asset can have or
>be.
>There is no need from my stakeholders to extract CPEs that represent
>"relates to" or "in the family of" information.

I think this might be the root of the confusion.  CPE does nothing to try
and support this.  All that CPE is trying to do is provide a list of all the
known identifiers.  The amount of metadata provided is very sparse.  Just
enough to know what it is that has been identified.  I think what you need
is additional metadata associated with a CPE that would enable you to search
through the list and find the CPE Names that are desired.

I think this is a very important use case but is one that is outside the
scope of CPE.  This problem is very similar to one that faces the CVE
community.  CVE is a list of vulnerability identifiers.  Users that need
more metadata rely on external databases to get it.  These external
databases (NVD is an example) use the CVE identifier to tag each entry and
associate metadata to it.

I think what you need is for the "National Product Database" to be created.
Agree?  This is something that had come up at CPE Developer Days as well.



>Furthermore, my
>stakeholders have a need for a logical language that can be used to
>construct queries to identify assets that have certain configurations.
>These queries may be very general and leverage the wildcard features of
>the matching algorithm.  These queries may also be very specific and target
>a single language or edition of a product.  In the latter case, the
>matching algorithm and library need to be able to return only assets that
>have the exact product installed (which is where my concern for issue 2.a
>comes in).

This seems to align with the goals of OVAL.


Thanks
Drew


Waltermire, David

Re: Abstract and Concrete CPE Names in the Dictionary

Drew,

From what I have been able to determine from Dave's emails, his use case is
based on the need to create database records that represent authoritative,
discrete product references in the CPE name format.  I would define a
discrete product reference as a CPE name that refers to a SKU or
electronically distributed content.  CPE names used in this fashion are not
generated by reference or by report, thus he is forced to look to the
official CPE dictionary.  Since this is a database application he is unable
to use the system inventory approach using OVAL definitions to qualify what
CPE names to use.  Instead he is looking for another authoritative hint.

Furthermore, it is not enough that all CPE names are unique; he is looking
for the set of CPE names that correspond directly on a one-to-one basis with
actual installed software.  Said a different way, he needs the ability for a
tool to be able to associate a discrete product with a corresponding CPE
name.  This mapping must be able to occur at differing levels of abstraction
relative to the CPE components.  I believe for most products that we have
the granularity in the CPE name to accomplish this so why not do this?

These capabilities are key for asset, license and procurement management
applications.  These use cases are definitely within the intended scope of
CPE, as references to standard product names are needed.  If all he needs to
support these use-cases is a flag that indicates if a CPE name refers to a
discrete product or not, I think we need to find a way to support this in an
official CPE capacity.  If the CPE standard does not support these
use-cases, we risk alienating users/vendors.  The worst-case scenario for
this situation is the creation of a competing standard.  This is not a win
for CPE.  We need to find a way forward to make this work.

Dave


Ernest Park-2

Re: Abstract and Concrete CPE Names in the Dictionary

Hi -
 
I have 4 databases, all of which talk to each other indirectly through a common CPE language.
 
 
I can do the following . . .
 
-- Join the four databases on the shared application name.
select vendor, open_source_db.application, part_v2, release,
       inventory_db.application_installed_date, master_db.platform,
       external_nvd.nvd_name, current
from open_source_db
inner join inventory_db on (inventory_db.application = open_source_db.application)
inner join external_nvd on (external_nvd.application = open_source_db.application)
inner join master_db on (master_db.application = open_source_db.application)
-- Keep only installed applications that have a current release name.
where inventory_db.application_installed_status = 'YES'
and inventory_db.application_currentreleasename is not null
into outfile 'CPE_v2-inventory list.csv'
 
 
Excuse any typos above - I wrote a simple query to illustrate the point.
 
Using common CPE syntax in all my databases, I can reliably exchange data between them and output it with an identifier built from CPE pieces.
 
In my output, the URI can be parsed from the output, even in Excel, such as . . .
 
=CONCATENATE("cpe",A4,B4,":",C4,":",D4,"   =",E4,"=",F4,"=",G4) 
 
Fields in < > refer to Excel cells.
 
 
=concatenate("cpe",<part>,<vendor>,<application>,<release>,<installed_date>,<platform>,<cve_names>)

part  vendor     application  release  installed_date  platform  cve_names
:/a:  erniesoft  killer_app   1.1      4022008         linux     cve-2008-7021

cpe:/a:erniesoft:killer_app:1.1   =4022008=linux=cve-2008-7021
 
Note that platform and install date are data fields that I supply and in this case are not constrained to anything. Platform could be "External approved Windows - v12", and so on.
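
For anyone who would rather script this than use a spreadsheet, the same concatenation can be sketched in Python. This is only an illustration of the formula above; the function name and the dictionary keys are my own assumptions (they mirror the column headings), and the "=" separators are the same ad hoc ones used in the Excel example, not part of CPE.

```python
def build_inventory_identifier(part, vendor, application, release,
                               installed_date, platform, cve_names):
    """Join inventory fields the same way the CONCATENATE formula does."""
    # CPE-style name: "cpe" + part + vendor, then ":"-separated fields.
    cpe_name = "cpe" + part + vendor + ":" + application + ":" + release
    # Site-specific fields are appended after the name, separated by "=".
    extras = "=".join([installed_date, platform, cve_names])
    return cpe_name + "   =" + extras


# Hypothetical row, copied from the example table above.
row = {
    "part": ":/a:",
    "vendor": "erniesoft",
    "application": "killer_app",
    "release": "1.1",
    "installed_date": "4022008",
    "platform": "linux",
    "cve_names": "cve-2008-7021",
}

print(build_inventory_identifier(**row))
```

The part before the three spaces is the CPE-shaped handle; everything after it is free-form data that I supply.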
 
 
Therefore,
 
I can output an inventory report down to the installed release, identify when it was installed and on what platform, and list the latest relevant CVE name. I do a report for subscribers that includes CVEs associated with currently installed releases and sorts results by CVSS, work load index, and my risk score. I scan 5 databases and the NVD XML to output the inventory risk management report. The CPE gives me a uniform handle by which to describe things across data sources.
 
 
So - NVD gives me CVE names, CVSS, and WLI; I generate an inventory report using scanning tools; I apply policy using policy management software; and so on.
 
If a user of my data wants to pull OSS license information, or all known releases, or patch status, etc, they can do so with a query built around CPE constructs. One of these days, a web service may front end the query above.
 
 
 
 
 
Ernie
 
 
 
 
 

Andrew Buttner

Re: Abstract and Concrete CPE Names in the Dictionary

In reply to this post by Waltermire, David
>These capabilities are key for asset, license and procurement
>management applications.

I completely agree, and hopefully CPE enables these types of
capabilities, but is it best for CPE to try and provide full support?
Or is it better for CPE to work with other efforts to provide full
support?


>These use cases are definitely within the intended scope of
>CPE, as references to standard product names are needed.  If all he
>needs to support these use-cases is a flag that indicates if a CPE
>name refers to a discrete product or not, I think we need to find
>a way to support this in an official CPE capacity.

I very much want to hear from others in the community about the above
statement.  Is this within the scope of CPE?  Should CPE work to define
additional metadata related to the identifier?  Or should CPE leave
this metadata work to others and focus solely on building the list of
identifiers?

A couple of points that I would like to bring to the table for this ...

* What is discrete to one user might not be discrete to another.  How
would CPE make this determination?  Is WinXP Pro discrete (I think that
is what you buy) or is WinXP Pro SP1 discrete (after you download the
SP)?  Actually, in reality you buy WinXP Pro English so is that
discrete?  This is similar to the issue of providing weights to
vulnerabilities.  Everyone has a different answer for what the weight
should be.  If the enumeration tries to set one as official, those that
don't agree will be alienated.

* Does expanding the scope of CPE beyond the task of providing
identifiers reduce our ability to succeed?  This is a lesson that has
been learned in the past.  CVE was one of the first to just focus on
the identifier and leave the metadata problem to others.  This resulted
in an enumeration that has by all accounts succeeded.  Other efforts
have tried to do too much and have failed (for a recent example, look at
AVDL).  CPE has started down the CVE path and focused on the identifier; my
personal suggestion is to stay on course.

* This type of metadata is perfect for an external data repository
built on CPE.  This allows CPE to focus on the identifiers, and the
repository to focus on supplying the use-case specific metadata.  These
repositories would not be competing with CPE, but rather leveraging
CPE.  CPE would allow these repositories to share information, and
users to pull information from each of the repositories.


I think this discussion is extremely useful for this community and I
urge everyone to weigh in with their thoughts.  I personally think that
the issue of where metadata lives is at the root of many of the
concerns that get brought up.  Should CPE focus on just the identifier?
Or should it also focus on providing a repository of metadata related
to the identifier?

Thanks
Drew
Kevin Sitto

Re: Abstract and Concrete CPE Names in the Dictionary

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Hi,

Dave's summary did an excellent job of capturing what appears to be a core risk with the current implementation of CPE: it is very difficult to relate actual items present on an asset to distinct CPE identifiers.

We are currently working with some of the same requirements David Valeri is; we have a set of customers who need to be able to express an inventory of the software/hardware present on their assets.  What's more, they would also like multiple systems to be able to compare notes regarding the inventory they identify while also having some sort of agreed upon mechanism for sharing that inventory with other organizations.

CPE would appear to be a natural fit for fulfilling such requirements.  However, sharing that concrete inventory information using CPE is entirely dependent on standardizing the references to those concrete inventory items.

If CPE merely identifies the abstract notion of "Microsoft, Windows XP, Gold" and relies on metadata existing elsewhere to fill the "gap" between that identifier and the actual application ("Microsoft, Windows XP, Professional, SP2, x64, en"), we risk losing the ability to guarantee that we are all using the same syntax to identify the same software - something I had always perceived to be the core use case of CPE.

Attempting to build actual CPE identifiers for concrete inventory items brings along its own set of complexities.  The one I've had the most difficulty working around is that software does not fit naturally within the notion of a hierarchy.  Rather, an inventory item is most naturally represented as the intersection between multiple pieces of metadata.

For instance, following the Windows XP Example from above, how would one build the CPE entry for "Microsoft Windows XP Professional SP2 x64 en"?

Would it be "cpe:/o:microsoft:windows_xp:professional:sp2:x64:en"?
Or "cpe:/o:microsoft:windows_xp:professional:sp2:en:x64"?

In this case, the operating system installed is the intersection between Product ("Microsoft Windows XP Professional"), Version ("SP2"), Architecture ("x64") and Language ("en").  Any one of these could reasonably fit at any place within the hierarchy.

Fortunately, as has been identified previously in the thread, getting from here to there is immediately possible without any major shift in the way CPE is currently defined.  Following the same hierarchical model and leveraging much of the existing content, it's just a matter of making a specific effort to add content that refers to those concrete items (i.e., we as a community all agree to use "cpe:/o:microsoft:windows_xp:professional:sp2:x64:en" and put that in the dictionary).  It will require some magic on the back end to support different types of aggregation (e.g., "list all assets with x64 operating systems"), but that's what app developers do well.
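
To make the "intersection of metadata" idea and the back-end aggregation a little more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions: the InventoryItem fields, the host names, and the assets_matching helper are all hypothetical and are not part of CPE or of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItem:
    """One installed operating system, described by independent attributes."""
    asset: str
    product: str
    version: str
    architecture: str
    language: str


def assets_matching(items, **criteria):
    """Return sorted asset names whose item has every requested attribute value."""
    return sorted({
        item.asset
        for item in items
        if all(getattr(item, name) == value for name, value in criteria.items())
    })


# Hypothetical inventory; the hosts are made up for illustration.
inventory = [
    InventoryItem("host-a", "windows_xp_professional", "sp2", "x64", "en"),
    InventoryItem("host-b", "windows_xp_professional", "sp2", "x86", "en"),
    InventoryItem("host-c", "windows_xp_professional", "sp1", "x64", "de"),
]

# "List all assets with x64 operating systems."
print(assets_matching(inventory, architecture="x64"))
```

Because each attribute is stored on its own, the order in which they would appear in a CPE name does not matter for this kind of query.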

Thanks,
Kevin


-----BEGIN PGP SIGNATURE-----
Version: 9.6.3 (Build 3017)

wsBVAwUBSE/8Ip3xz8BLNKAgAQi0VggAkQiqwALtvBFChrrdpfReCP7f/Q+mS9Ph
G0VkmDIz3kJIJ5CHsEMSYmW70CbLhcN3sGAvBGApl3UaLDqhjlxR1iQjZ1W6rjGj
3M/qNpWghVtDj3c97HTz6PLB5J0UcYfn7YFmJ2B5QnvkdbVJ7RZXMmMwDOER6kUv
yCmPOpapzGaUrkRB4XpFccxKIO6ppFwEOa5nrqQ00/r8ykzpC558JGag/bs1nWpe
ye5yC3G3yonMriFwQMxcpSBNrAwO8v0lxv//EWUWfmv2BXzVajWbcmr3HlbYWOg6
zGJmpKNnzIaQ78sTu6B2MdmDk/JdRfT+1hVFQaejNhy5lxyM6Fbl5Q==
=t2qc
-----END PGP SIGNATURE-----

Harold Booth-2

Re: Abstract and Concrete CPE Names in the Dictionary

In reply to this post by Andrew Buttner
My response is in-line below.


> >These capabilities are key for asset, license and procurement management
> >applications.
>
> I completely agree, and hopefully CPE enables these types of
> capabilities, but is it best for CPE to try and provide full support?
> Or is it better for CPE work with other efforts to provide full
> support?
>
> >These use cases are definitely within the intended scope of
> >CPE, as references to standard product names are needed.  If all he
> >needs to support these use-cases is a flag that indicates if a CPE
> >name refers to a discrete product or not, I think we need to find
> >a way to support this in an official CPE capacity.
>
> I very much want to hear from others in the community about the above
> statement.  Is this within the scope of CPE?  Should CPE work to define
> additional metadata related to the identifier?  Or should CPE leave
> this metadata work to others and focus solely on building the list of
> identifiers?

Why shouldn't CPE try to provide support for these types of capabilities?  Why
add another standard to the mix when CPE can handle this use case with some
minor changes?  Admittedly, CPE cannot be all things to all people but it should
handle the basic problem of communicating product/platform information across
various domains.  The domains need to include not only vulnerability and
checklist data providers but other aspects of an enterprise such as asset and
licensing management.  My understanding is that the CPE community has decided
that part, vendor, product, version, update, edition, and language are
sufficient to uniquely identify a product.  Is this understanding correct?  If
not what could be added to allow for unique identification?

Another way to solve this without even adding a flag is for the
"official" dictionary to contain CPEs down to the language level, even if that
requires specifying default values for some of the components.  According to the
matching algorithm, CPEs with fewer components are implied.  Data
providers could provide the meta-data for CPEs of fewer components as needed or
desired.

Taking the previous solution a bit further, I would argue that an official "CPE"
identifier is always all seven components.  Shorter CPEs are merely useful in
the context of matching or where less specificity is needed.  Since you are
concerned with meta-data creep why not have the "official" dictionary provide
just the identifiers with maybe a brief description and no other additional
meta-data?  Titles, references, checks, and any other meta-data would be
value-add to the CPE provided by data providers (like the NVD).
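
As a rough illustration of the "always seven components" idea, the following Python sketch expands any CPE URI into the seven named components, filling missing trailing ones with empty strings.  The component names are the ones listed earlier in this message; the parse_cpe function itself is hypothetical and is not taken from the specification.

```python
COMPONENTS = ("part", "vendor", "product", "version",
              "update", "edition", "language")


def parse_cpe(name):
    """Split a CPE URI into the seven named components, padding with ''."""
    prefix = "cpe:/"
    if not name.startswith(prefix):
        raise ValueError(f"not a CPE URI: {name!r}")
    fields = name[len(prefix):].split(":")
    if len(fields) > len(COMPONENTS):
        raise ValueError(f"too many components: {name!r}")
    # Shorter names are padded so every result has all seven keys.
    fields += [""] * (len(COMPONENTS) - len(fields))
    return dict(zip(COMPONENTS, fields))


print(parse_cpe("cpe:/o:microsoft:windows_xp"))
print(parse_cpe("cpe:/o:microsoft:windows_xp::gold:pro:en"))
```

A dictionary that stored only fully expanded entries would still let tools derive the shorter names by dropping trailing components.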

Addressing your points:

> * What is discrete to one user might not be discrete to another.  How
> would CPE make this determination?  Is WinXP Pro discrete (I think that
> is what you buy) or is WinXP Pro SP1 discrete (after you download the
> SP)?  Actually, in reality you buy WinXP Pro English so is that
> discrete?  This is similar to the issue of providing weights to
> vulnerabilities.  Everyone has a different answer for what the weight
> should be.  If the enumeration tries to set one as official, those that
> don't agree will be alienated.

I strongly disagree with the analogy to CVEs and CVSS scoring.  What is being
talked about here is what constitutes an official identifier not what is the
particular value for a piece of meta-data.

As mentioned earlier, an "official" CPE should be as fine-grained as the
standard currently defines.  Less granular CPEs are always implied by the more
specific ones.  In this way no one is "alienated"; users of the less granular
CPEs still have an agreed upon identifier.  Taking your examples above, the
initial release of the English-language version of WinXP Pro could be:
cpe:/o:microsoft:windows_xp::gold:pro:en

Once Service Pack 1 is released (either for download or eventually sold retail)
a new entry would be:
cpe:/o:microsoft:windows_xp::sp1:pro:en

If a reference to only Microsoft Windows XP is desired, then the CPE:
cpe:/o:microsoft:windows_xp
would refer to both of these CPEs in this example.
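
How a shorter name can "refer to" the longer ones can be sketched as a simple component-by-component prefix comparison.  This is a minimal sketch, assuming that an empty component in the general name acts as a wildcard; it is not a transcription of the matching algorithm in the specification, and the refers_to helper and the Windows 2000 entry are hypothetical.

```python
def components(name):
    """Return the ':'-separated components of a CPE URI."""
    prefix = "cpe:/"
    if not name.startswith(prefix):
        raise ValueError(f"not a CPE URI: {name!r}")
    return name[len(prefix):].split(":")


def refers_to(general, specific):
    """True if every component of `general` is empty or equals `specific`'s."""
    g, s = components(general), components(specific)
    if len(g) > len(s):
        return False
    return all(gc == "" or gc == sc for gc, sc in zip(g, s))


dictionary = [
    "cpe:/o:microsoft:windows_xp::gold:pro:en",
    "cpe:/o:microsoft:windows_xp::sp1:pro:en",
    "cpe:/o:microsoft:windows_2000::sp4:pro:en",  # hypothetical extra entry
]

matches = [n for n in dictionary if refers_to("cpe:/o:microsoft:windows_xp", n)]
print(matches)
```

With this comparison the short Windows XP name selects both XP entries and skips the unrelated one, while a name that specifies the update component selects only the entries with that update.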

>
> * Does expanding the scope of CPE beyond the task of providing
> identifiers reduce our ability to succeed?  This is a lesson that has
> been learned in the past.  CVE was one of the first to just focus on
> the identifier and leave the metadata problem to others.

I have two points on this.  First, the problem that CPE is trying to solve is
more difficult than for CVEs.  Not only do we wish to communicate about specific
products but we wish to also talk about groups of them as well.  Second, CPE
already extends beyond just the task of providing identifiers.  The CPE language
structure codifies how to combine various products together to create a
platform, and the CPE dictionary specification describes how to communicate
lists of CPEs along with the meta-data.  I agree that CPE should not attempt to
describe every possible association of meta-data to a particular identifier.
But CPE should specify a minimum set of "globally useful" meta-data as well as a
mechanism to allow for arbitrary meta-data associations, facilitating
communication of this meta-data between products which use CPEs.  In the current
version of the CPE specification the required encoded meta-data could be the
seven component pieces of a CPE identifier.

>
> * This type of metadata is perfect for an external data repository
> built on CPE.  This allows CPE to focus on the identifiers, and the
> repository to focus on supplying the use-case specific metadata.  These
> repositories would not be competing with CPE, but rather leveraging
> CPE.  CPE would allow these repositories to share information, and
> users to pull information from each of the repositories.

See above.

> Should CPE focus on just the identifier?
> Or should it also focus on providing a repository of metadata related
> to the identifier?

Ultimately, I would like to see CPE as a means to communicate about products.  I
think the CPE specification should not only provide a means to associate an
identifier with a product but also a way to communicate about that product or
groups of products.  The CPE specification should also provide a standardized
means to associate arbitrary meta-data with a CPE as well as a way to describe
this meta-data.

-Harold
Ernest Park-2

Re: Abstract and Concrete CPE Names in the Dictionary

I think an element that we are missing is a definitive identifier that associates a human-friendly name with an absolute ID.
 
When a name is created, we need MD5s (as an example) for a definitive file that identifies that app, release, patch.
 
Any name thereafter is a valuable identifier, but can be aliased. In this way, multiple names can identify a single release. Additionally, multiple machine identifiers can be identified with a "name".
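
A rough Python sketch of what I mean, assuming MD5 over the bytes of one definitive file is the absolute ID. The ReleaseRegistry class, its method names, the file contents, and the second (aliased) name are all hypothetical; this is an illustration, not a proposal for the dictionary format.

```python
import hashlib


class ReleaseRegistry:
    """Maps a file digest (the absolute ID) to every name that aliases it."""

    def __init__(self):
        self._names_by_digest = {}

    @staticmethod
    def fingerprint(file_bytes):
        # MD5 is used only because it is the example given above.
        return hashlib.md5(file_bytes).hexdigest()

    def register(self, name, file_bytes):
        """Record `name` as one of the names for this exact file."""
        digest = self.fingerprint(file_bytes)
        self._names_by_digest.setdefault(digest, set()).add(name)
        return digest

    def names_for(self, file_bytes):
        """Return all known names for this exact file, sorted."""
        digest = self.fingerprint(file_bytes)
        return sorted(self._names_by_digest.get(digest, set()))


registry = ReleaseRegistry()
installer = b"contents of the definitive installer file"  # stand-in bytes
registry.register("cpe:/a:erniesoft:killer_app:1.1", installer)
registry.register("cpe:/a:erniesoft:killerapp:1.1", installer)  # an alias

print(registry.names_for(installer))
```

Both names resolve to the same digest, so two tools that picked different names can still agree they are talking about the same release.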
 
Regarding the comments below, I thought of a few things.
 
 
 
 
The problem above is that we go from distinct naming, where we are identifying "an electronic thing", to now identifying "an electronic thing" and its condition of being.
 
Sort of like a license plate that has to identify the owner and the highway on which the car operates, all in the 7 characters on the plate.
 
 
So, maybe what is missing is the further development of the concept of a "part" - cpe:/a:.  
 
Does there need to be greater granularity with parts, in order to identify not only operating systems but also those that are 64-bit, 32-bit, specific to a chipset, and so on?
 
Then, if we have a "part" identifier, each name will have "<vendor>:<app>:<release>" and will be joined with "part".
 
 
Part may need to be decoupled from name in order to allow highest order names to be built before all the granular "part" details are finished.
 
 
 
In sum, part becomes a distinction of an application, but is different than a release.
 
 
 
 


 
On Wed, Jun 11, 2008 at 12:35 PM, Harold Booth <[hidden email]> wrote:
My response is in-line below.


> >These capabilities are key for asset, license and procurement management
> >applications.
>
> I completely agree, and hopefully CPE enables these types of
> capabilities, but is it best for CPE to try and provide full support?
> Or is it better for CPE work with other efforts to provide full
> support?
>
> >These use cases are definitely within the intended scope of
> >CPE, as references to standard product names are needed.  If all he
> >needs to support these use-cases is a flag that indicates if a CPE
> >name refers to a discrete product or not, I think we need to find
> >a way to support this in an official CPE capacity.
>
> I very much want to hear from others in the community about the above
> statement.  Is this within the scope of CPE?  Should CPE work to define
> additional metadata related to the identifier?  Or should CPE leave
> this metadata work to others and focus solely on building the list of
> identifiers?

Why shouldn't CPE try to provide support for these types of capabilities?  Why
add another standard to the mix when CPE can handle this use case with some
minor changes?  Admittedly, CPE cannot be all things to all people but it should
handle the basic problem of communicating product/platform information across
various domains.  The domains need to include not only vulnerability and
checklist data providers but other aspects of an enterprise such as asset and
licensing management.  My understanding is that the CPE community has decided
that part, vendor, product, version, update, edition, and language are
sufficient to uniquely identify a product.  Is this understanding correct?  If
not what could be added to allow for unique identification?

Another way to solve this without even requiring adding a bit is if the
"official" dictionary contains CPEs down to the language level, even if that
requires specifying default values for some of the components.  According to the
matching algorithm, CPEs using fewer component lengths are implied.  Data
providers could provide the meta-data for CPEs of fewer components as needed or
desired.

Taking the previous solution a bit farther I would argue that an official "CPE"
identifier is always all seven components.  Shorter CPEs are merely useful in
the context of matching or where less specificity is needed.  Since you are
concerned with meta-data creep why not have the "official" dictionary provide
just the identifiers with maybe a brief description and no other additional
meta-data?  Titles, references, checks, and any other meta-data would be
value-add to the CPE provided by data providers (like the NVD).

Addressing your points:

> * What is discrete to one user might not be discrete to another.  How
> would CPE make this determination?  Is WinXP Pro discrete (I think that
> is what you buy) or is WinXP Pro SP1 discrete (after you download the
> SP)?  Actually, in reality you buy WinXP Pro English so is that
> discrete?  This is similar to the issue of providing weights to
> vulnerabilities.  Everyone has a different answer for what the weight
> should be.  If the enumeration tries to set one as official, those that
> don't agree will be alienated.

I strongly disagree with the analogy to CVEs and CVSS scoring.  What is being
talked about here is what constitutes an official identifier not what is the
particular value for a piece of meta-data.

As mentioned earlier, an "official" CPE should be as fine-grained as the
standard currently defines.  Less granular CPEs are always implied by the more
specific ones.  In this way no one is "alienated"; users of the less granular
CPEs still have an agreed upon identifier.  Taking your examples above:
Initial release of the English Language version of WinXP Pro could be:
cpe:/o:microsoft:windows_xp::gold:pro:en

Once Service Pack 1 is released (either for download or eventually sold retail)
a new entry would be:
cpe:/o:microsoft:windows_xp::sp1:pro:en

If a reference to only Microsoft Windows XP is desired then the cpe:
cpe:/o:microsoft:windows_xp
would refer to both of these CPEs in this example.
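
The "less granular names are implied" behaviour can be sketched in a few lines of Python. This is a simplified reading of component-wise matching (an empty or missing component in the general name matches anything), not the official matching algorithm; the function names are made up for the example.

```python
def components(name):
    """Split a CPE URI into its colon-separated components."""
    return name[len("cpe:/"):].split(":")


def covers(general, specific):
    """True if every non-empty component of `general` equals the
    corresponding component of `specific`."""
    g, s = components(general), components(specific)
    if len(g) > len(s):
        # Pad the shorter name so trailing components compare against ''.
        s = s + [""] * (len(g) - len(s))
    return all(gc == "" or gc == sc for gc, sc in zip(g, s))


dictionary = [
    "cpe:/o:microsoft:windows_xp::gold:pro:en",
    "cpe:/o:microsoft:windows_xp::sp1:pro:en",
]
implied = [n for n in dictionary if covers("cpe:/o:microsoft:windows_xp", n)]
sp1_only = [n for n in dictionary if covers("cpe:/o:microsoft:windows_xp::sp1", n)]
```

Here the short name selects both dictionary entries, while the name with the update component set to sp1 selects only the second.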

>
> * Does expanding the scope of CPE beyond the task of providing
> identifiers reduce our ability to succeed?  This is a lesson that has
> been learned in the past.  CVE was one of the first to just focus on
> the identifier and leave the metadata problem to others.

I have two points on this.  First, the problem that CPE is trying to solve is
more difficult than for CVEs.  Not only do we wish to communicate about specific
products, but we also wish to talk about groups of them.  Second, CPE
already extends beyond just the task of providing identifiers.  The CPE language
structure codifies how to combine various products together to create a
platform, and the CPE dictionary specification describes how to communicate
lists of CPEs along with the meta-data.  I agree that CPE should not attempt to
describe every possible association of meta-data to a particular identifier.
But CPE should specify a minimum set of "globally useful" meta-data as well as a
mechanism to allow for arbitrary meta-data associations, facilitating
communication of this meta-data between products which use CPEs.  In the current
version of the CPE specification the required encoded meta-data could be the
seven component pieces of a CPE identifier.

>
> * This type of metadata is perfect for an external data repository
> built on CPE.  This allows CPE to focus on the identifiers, and the
> repository to focus on supplying the use-case specific metadata.  These
> repositories would not be competing with CPE, but rather leveraging
> CPE.  CPE would allow these repositories to share information, and
> users to pull information from each of the repositories.

See above.

> Should CPE focus on just the identifier?
> Or should it also focus on providing a repository of metadata related
> to the identifier?

Ultimately, I would like to see CPE as a means to communicate about products.  I
think the CPE specification should not only provide a means to associate an
identifier with a product but also a way to communicate about that product or
groups of products.  The CPE specification should also provide a standardized
means to associate arbitrary meta-data with a CPE as well as a way to describe
this meta-data.

-Harold

Eirik Iverson

Re: Abstract and Concrete CPE Names in the Dictionary

In reply to this post by Ernest Park-2

Ernest –

Nice database example.

The questions below illustrate my confusion about CPE…

How was your inventory database actually populated?  Was it done so explicitly via data entry or import (e.g., host XYZ has Microsoft Excel 2003)?  Or, was it derived from another table with retrieved host data such as “C:\Program Files\Microsoft Office\OFFICE11\excel.exe” and “11.0.8211.0”?

How does one systematically determine asset inventory for an entire population of endpoints based on data that can be collected so that one can take advantage of the rest of the framework (e.g., CVE, CVSS, etc.)?

Is the scope of CPE limited such that tools (e.g., patch management, configuration management, vulnerability assessment, etc.) must first identify assets and report their findings in a CPE compliant manner so that one can then leverage all of this great work?

Cheers,
Eirik

 


From: Ernest Park [mailto:[hidden email]]
Sent: Tuesday, June 10, 2008 5:56 PM
To: [hidden email]
Subject: Re: [CPE-DISCUSSION-LIST] Abstract and Concrete CPE Names in the Dictionary

 

Hi -

I have 4 databases, all of which talk to each other indirectly through common CPE language.

I can do the following . . .

 

-- Join the CPE-keyed databases on the shared application name.
select vendor, open_source_db.application, part_v2, release,
       inventory_db.application_installed_date, master_db.platform,
       external_nvd.nvd_name, current
from open_source_db
inner join inventory_db on (inventory_db.application = open_source_db.application)
inner join external_nvd on (external_nvd.application = open_source_db.application)
inner join master_db on (master_db.application = open_source_db.application)
-- Keep only installed applications that have a current release name.
where inventory_db.application_installed_status = 'YES'
  and inventory_db.application_currentreleasename is not null
into outfile 'CPE_v2-inventory list.csv'

 

 

Excuse any typos above - I wrote a simple query to illustrate the point.

 

Using common CPE syntax in all my databases, I can reliably exchange data between databases, and output them with an identifier that will conform to having CPE pieces.
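
For readers who want to try the idea of joining separate data sources on a shared CPE-style key, the following is a small self-contained Python sketch using the standard sqlite3 module. It borrows two table names and a few column names from the query above, but the schema, the in-memory database, and the sample rows (including the placeholder identifier for old_app) are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory_db (application TEXT,
                               application_installed_status TEXT);
    CREATE TABLE external_nvd (application TEXT, nvd_name TEXT);
    -- Sample rows; 'old_app' and its identifier are placeholders.
    INSERT INTO inventory_db VALUES ('killer_app', 'YES');
    INSERT INTO inventory_db VALUES ('old_app', 'NO');
    INSERT INTO external_nvd VALUES ('killer_app', 'cve-2008-7021');
    INSERT INTO external_nvd VALUES ('old_app', 'placeholder-id');
""")

# Join the two sources on the shared application name and keep
# only applications that are actually installed.
rows = conn.execute("""
    SELECT inventory_db.application, external_nvd.nvd_name
    FROM inventory_db
    INNER JOIN external_nvd
            ON external_nvd.application = inventory_db.application
    WHERE inventory_db.application_installed_status = 'YES'
""").fetchall()
```

The join only works because both tables spell the application name identically, which is the role the common CPE naming plays in the setup described here.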

 

In my output, the URI can be parsed from the output, even in Excel, such as . . .

 

=CONCATENATE("cpe",A4,B4,":",C4,":",D4,"   =",E4,"=",F4,"=",G4) 

 

Fields in < > refer to Excel cells.

 

 

=concatenate("cpe",<part>,<vendor>,<application>,<release>,<installed_date>,<platform>,<cve_names>)

part  vendor     application  release  installed_date  platform  cve_names
------------------------------------------------------------------------------
:/a:  erniesoft  killer_app   1.1      4022008         linux     cve-2008-7021

cpe:/a:erniesoft:killer_app:1.1   =4022008=linux=cve-2008-7021

 

Note that platform and install date are data fields that I supply and in this case are not constrained to anything. Platform could be "External approved Windows - v12", and so on.
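
The same concatenation can be sketched outside Excel. The Python below rebuilds the example output string from the row shown above; the function names are hypothetical, and the part value is stored as a bare "a" rather than the ":/a:" cell used in the spreadsheet, which is a choice made for this sketch.

```python
def cpe_uri(part, vendor, application, release):
    """Assemble the CPE-style URI from its pieces."""
    return "cpe:/" + ":".join([part, vendor, application, release])


def report_key(row):
    """Append the locally supplied fields after the URI, '='-separated,
    mirroring the spreadsheet formula."""
    uri = cpe_uri(row["part"], row["vendor"],
                  row["application"], row["release"])
    extras = "=".join([row["installed_date"], row["platform"],
                       row["cve_names"]])
    return uri + "   =" + extras


row = {
    "part": "a",
    "vendor": "erniesoft",
    "application": "killer_app",
    "release": "1.1",
    "installed_date": "4022008",
    "platform": "linux",
    "cve_names": "cve-2008-7021",
}
key = report_key(row)
# The URI can be recovered by cutting at the first '='.
uri_only = key.split("=", 1)[0].strip()
```

Because the platform field is unconstrained, a value containing "=" would make the trailing fields ambiguous; the URI portion would still be recoverable as long as the URI itself contains no "=".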

 

 

Therefore, I can output an inventory report down to the installed release, and identify when it was installed, on what platform, and the latest relevant CVE name. I do a report for subscribers that includes CVEs associated with currently installed releases and sorts results by CVSS, Work load index and my risk score. I scan 5 databases and the NVD XML to output the inventory risk management report. The CPE gives me a uniform handle by which to describe things across data sources.

So - NVD gives me CVE names, CVSS, and WLI; I generate an inventory report using scanning tools; I apply policy using policy management software; and so on.

If a user of my data wants to pull OSS license information, or all known releases, or patch status, etc., they can do so with a query built around CPE constructs. One of these days, a web service may front-end the query above.

 

 

 

 

 

Ernie

 

 

 

 

 

On Tue, Jun 10, 2008 at 4:13 PM, David Waltermire <[hidden email]> wrote:

Drew,

From what I have been able to determine from Dave's emails, his use case is
based on the need to create database records that represent authoritative,
discrete product references in the CPE name format.  I would define a
discrete product reference as a CPE name that refers to a SKU or
electronically distributed content.  CPE names used in this fashion are not
generated by reference or by report, thus he is forced to look to the
official CPE dictionary.  Since this is a database application he is unable
to use the system inventory approach using OVAL definitions to qualify what
CPE names to use.  Instead he is looking for another authoritative hint.

Furthermore, it is not enough that all CPE names are unique; he is looking
for the set of CPE names that correspond directly on a one-to-one basis with
actual installed software.  Said a different way he needs the ability for a
tool to be able to associate a discrete product with a corresponding CPE
name.  This mapping must be able to occur at differing levels of abstraction
relative to the CPE components.  I believe for most products that we have
the granularity in the CPE name to accomplish this so why not do this?

These capabilities are key for asset, license and procurement management
applications.  These use cases are definitely within the intended scope of
CPE, as references to standard product names are needed.  If all he needs to
support these use-cases is a flag that indicates if a CPE name refers to a
discrete product or not, I think we need to find a way to support this in an
official CPE capacity.  If the CPE standard does not support these
use-cases, we risk alienating users/vendors.  The worst case scenario for
this situation is the creation of a competing standard.  This is not a win
for CPE.  We need to find a way forward to make this work.

Dave



-----Original Message-----
From: Buttner, Drew [mailto:[hidden email]]

Sent: Tuesday, June 10, 2008 1:45 PM
To: [hidden email]
Subject: Re: [CPE-DISCUSSION-LIST] Abstract and Concrete CPE Names in the
Dictionary

>They wish to express a definitive "has this" or "is this" relationship
>to the finest degree possible in CPE.  To the language level is
>preferred; however, I believe to the edition level is sufficient for
>most use cases.

This relationship is something that they need to define anyway (CPE doesn't
try to define it) so it is perfectly acceptable for them to say that the CPE
Name that has been assigned to the asset means "the asset is of this
platform type".



>To represent a "has this" or "is this" relationship to the desired level
>of specificity, a CPE assigned to an asset must be unique,

All CPE Names must be unique by definition.



>of the finest granularity possible

I would reword this to say "must be at the granularity they desire".  You
even stated that going down to language is not desired.



>and must not be susceptible to the scenario I laid
>out in question 2.a of my original email.

Agreed and I think this is covered.



>Currently, my stakeholders have a need to extract the entries from the
>dictionary that represent products or hardware that an asset can have or
>be.
>There is no need from my stakeholders to extract CPEs that represent
>"relates to" or "in the family of" information.

I think this might be the root of the confusion.  CPE does nothing to try
and support this.  All that CPE is trying to do is provide a list of all the
known identifiers.  The amount of metadata provided is very sparse.  Just
enough to know what it is that has been identified.  I think what you need
is additional metadata associated with a CPE that would enable you to search
through the list and find the CPE Names that are desired.

I think this is a very important use case but is one that is outside the
scope of CPE.  This problem is very similar to one that faces the CVE
community.  CVE is a list of vulnerability identifiers.  Users that need
more metadata rely on external databases to get it.  These external
databases (NVD is an example) use the CVE identifier to tag each entry and
associate metadata to it.

I think what you need is for the "National Product Database" to be created.
Agree?  This is something that had come up at CPE Developer Days as well.



>Furthermore, my
>stakeholders have a need for a logical language that can be used to
>construct queries to identify assets that have certain configurations.
>These queries may be very general and leverage the wildcard features of
>the matching algorithm.  These queries may also be very specific and target
>a single language or edition of a product.  In the latter case, the
>matching algorithm and library need to be able to return only assets that
>have the exact product installed (which is where my concern for issue 2.a
>comes in).

This seems to align with the goals of OVAL.


Thanks
Drew

 

Ernest Park-2

Re: Abstract and Concrete CPE Names in the Dictionary

Hi Eirik -

On Wed, Jun 11, 2008 at 3:39 PM, Eirik Iverson <[hidden email]> wrote:
> Ernest –
>
> Nice database example.
>
> The questions below illustrate my confusion about CPE…
>
> How was your inventory database actually populated?  Was it done so explicitly via data entry or import (e.g., host XYZ has Microsoft Excel 2003)?  Or, was it derived from another table with retrieved host data such as "C:\Program Files\Microsoft Office\OFFICE11\excel.exe" and "11.0.8211.0"?
 
Neither. I use Palamida IP Amplifier product. I added custom signatures and additional vendor, app, release metadata.
 
So, step 1. Use a tool that can scan unknown code and create an inventory. Ideally, choose a tool that outputs CPE constructs or a CPE URI. I push out the pieces, since I need to massage and correct misassociations later. By having the pieces, I can be certain of vendor, app, and just fix a release name.
 
 
 
IP Amp scans and IDs the unknown files. Once I have the products IDed, I start layering metadata in - so all Apache prods get a "vendor" attribute like apache_software_foundation, and so on.
 
I maintain a number of parallel databases, so I use common application name searching across all DBs. When we automatically add a new product to the inventory database, if no metadata exists, the report returns the best match to LIKE searches. We can then manually go in and either permanently add a new product, a new alias for existing, or correct identification for a product that was added.
 
http://gpl3.palamida.com has a downloadable database that has all the pieces needed to parse into a CPE name.
 
 
The inventory report is used as part of a wget/google api script. Using what I know, I query the internet and attempt to auto-match metadata, like URL, project home page, vendor name, product description, and so on.
 
 
> How does one systematically determine asset inventory for an entire population of endpoints based on data that can be collected so that one can take advantage of the rest of the framework (e.g., CVE, CVSS, etc.)?
 
Code scanning tools do exactly this. I use IP Amp to scan, find releases, associate CPE names, and so on.
 
BTW - due to data issues, leveraging CVE data is not entirely enabled via CPE names. I end up doing indirect vendor, app, release lookups against the CVE data to find the CVEs that match a release.
 
 
 
> Is the scope of CPE limited such that tools (e.g., patch management, configuration management, vulnerability assessment, etc.) must first identify assets and report their findings in a CPE compliant manner so that one can then leverage all of this great work?
 
 
Yes. CPE is a name string. It is only a way to ensure that cpe:/a:vendor:app:release means the same to you as it does to me.
 
Once you have a way to identify an asset via a distinct CPE name, or a higher level part of the asset, like cpe:/a:vendor, you can now map metadata to the identifier.
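
One way to picture "mapping metadata to the identifier" at different levels is a lookup that collects metadata from a name and from each of its shorter prefixes. The Python below is only a sketch under that assumption: the metadata keys and values are invented, the erniesoft/killer_app names come from the earlier example, and the rule that more specific entries override general ones is a design choice of this sketch, not something CPE defines.

```python
def ancestors(name):
    """Yield the name, then each shorter prefix down to the part,
    e.g. cpe:/a:vendor:app -> cpe:/a:vendor -> cpe:/a."""
    fields = name[len("cpe:/"):].split(":")
    for n in range(len(fields), 0, -1):
        prefix = fields[:n]
        if prefix[-1] == "":
            continue  # skip prefixes that end in an empty component
        yield "cpe:/" + ":".join(prefix)


# Hypothetical metadata store keyed by CPE name at any level.
metadata = {
    "cpe:/a:erniesoft": {"vendor_note": "hypothetical vendor record"},
    "cpe:/a:erniesoft:killer_app:1.1": {"cve_names": ["cve-2008-7021"]},
}


def lookup(name):
    """Merge metadata from general to specific."""
    merged = {}
    for key in reversed(list(ancestors(name))):
        merged.update(metadata.get(key, {}))
    return merged


info = lookup("cpe:/a:erniesoft:killer_app:1.1")
```

A release-level name then picks up both the vendor-level record and its own release-level record in a single call.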
 
 
 
Much of my data is automatically collected, but some of it, like GPL3, is hand collected. I use forms to constrain the information going into the database so that it includes all the required elements, and interns resolve any "errata", unresolved names, bad information, failure to map within databases - like unknown or new vendors, products, etc.
 
 
CPE is not a data management solution. It is just a way to share distinct information with common keys between users and electronic sources.
 
 
 
 
 
> Cheers,
> Eirik
 

 

dwhite-5

mapping applicability to install

In reply to this post by Andrew Buttner
The NIST NSRL has such a mapping, though not currently in CPE format.

On Jun 10, 2008, at 11:00 AM, Buttner, Drew wrote:

> To solve your
> need, someone must create a mapping that relates different CPE Names
> based on applicability and ability to install.  This mapping might look
> like:
>
> Host OS                          Application
> ---------------------------------------------------
> cpe:/o:microsoft:windows_xp      cpe:/a:vendor:app1
> cpe:/o:microsoft:windows_xp      cpe:/a:vendor:app3
> cpe:/o:microsoft:windows_xp::sp1 cpe:/a:vendor:app2
> cpe:/o:microsoft:windows_2003    cpe:/a:vendor:app3
>
>
> Application                      Host OS
> ---------------------------------------------------
> cpe:/a:vendor:app1               cpe:/o:microsoft:windows_xp
> cpe:/a:vendor:app1               cpe:/o:microsoft:windows_2003
>
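
A mapping of the kind quoted above could be held as a list of (host OS, application) pairs and indexed in both directions. The Python sketch below uses only the four rows of the first quoted table and derives the application-to-host view from them; the variable names are made up, and this says nothing about how the NSRL actually stores its data.

```python
from collections import defaultdict

# (host OS, application) pairs copied from the first quoted table.
HOST_APP_PAIRS = [
    ("cpe:/o:microsoft:windows_xp", "cpe:/a:vendor:app1"),
    ("cpe:/o:microsoft:windows_xp", "cpe:/a:vendor:app3"),
    ("cpe:/o:microsoft:windows_xp::sp1", "cpe:/a:vendor:app2"),
    ("cpe:/o:microsoft:windows_2003", "cpe:/a:vendor:app3"),
]

apps_by_host = defaultdict(list)
hosts_by_app = defaultdict(list)
for host, app in HOST_APP_PAIRS:
    apps_by_host[host].append(app)   # which applications install on a host OS
    hosts_by_app[app].append(host)   # which host OSes an application supports

xp_apps = apps_by_host["cpe:/o:microsoft:windows_xp"]
app3_hosts = hosts_by_app["cpe:/a:vendor:app3"]
```

Keeping one list of pairs and deriving both views avoids the two tables drifting apart.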

Douglas White              National Institute of Standards and Technology
NIST, 100 Bureau Drive Stop 8970, Gaithersburg, MD 20899-8970
      National Software Reference Library - http://www.nsrl.nist.gov
Voice: 301-975-4761     Fax: 301-975-6097   Email: [hidden email]
My opinions aren't necessarily my employer's nor any other organization's.
"Even if you're on the right track, you'll get run over if you just sit there" - Will Rogers

dwhite-5

Re: Abstract and Concrete CPE Names in the Dictionary

In reply to this post by Andrew Buttner
On Jun 11, 2008, at 8:34 AM, Buttner, Drew wrote:

>> These use cases are definitely within the intended scope of
>> CPE, as references to standard product names are needed.  If all he
>> needs to support these use-cases is a flag that indicates if a CPE
>> name refers to a discrete product or not, I think we need to find
>> a way to support this in an official CPE capacity.
>
> I very much want to hear from others in the community about the above
> statement.  Is this within the scope of CPE?  Should CPE work to define
> additional metadata related to the identifier?  Or should CPE leave
> this metadata work to others and focus solely on building the list of
> identifiers?

> * This type of metadata is perfect for an external data repository
> built on CPE.  This allows CPE to focus on the identifiers, and the
> repository to focus on supplying the use-case specific metadata.  These
> repositories would not be competing with CPE, but rather leveraging
> CPE.  CPE would allow these repositories to share information, and
> users to pull information from each of the repositories.

The NSRL is looking to CPE for exactly the reason above: to leverage
CPE in order to share information.  NSRL collects metadata, e.g.
SHA-1/MD5/CRC hashes, filename, directory path, byte size, MAC timestamps,
etc., and we are very receptive to collecting other metadata or running
algorithms against our collection, if we don't already have what the
community needs.


Gary Newman-2

Re: Abstract and Concrete CPE Names in the Dictionary

In reply to this post by Ernest Park-2
Hi Ernest,

Do your tools use vendor provided data to name the vendor and application, e.g.
those strings from an RPM distribution?  Or, are the vendor and application
names hand added?

Cheers,

        -Gary-

On 11 Jun 2008 at 21:45, Ernest Park wrote:

> Neither. I use Palamida IP Amplifier product. I added custom signatures and
> additional vendor, app, release metadata.
>
> So, step 1. Use a tool that can scan unknown code and create an inventory.
> Ideally, choose a tool that outputs CPE constructs or a CPE URI. I push out
> the pieces, since I need to massage and correct misassociations later. By
> having the pieces, I can be certain of vendor, app, and just fix a release
> name.
>
>
>
> IP Amp scans and IDs the unknown files. Once I have the products IDed, I start
> layering metadata in - so all Apache prods get a "vendor" attribute like
> apache_software_foundation, and so on.
Ernest Park-2

Re: Abstract and Concrete CPE Names in the Dictionary

One example I have worked goes as follows, all automatic:

For a given archive file, I extract it onto the file system, crawl the tree and look for strings (copyright, published, copying), and I look at the attribution in the header of source files.

We assemble an array of search terms.

(some human intervention here usually)

We feed the most common strings into the Google API, and then get back the project URL and the vendor URL.
 
 
Keep in mind - when doing this by hand, we can find the vendor in a minute. Automatically, we look for directory names, key file names, attribution, and then we intersect this.
 
 
Google API will give me the vendor "likely name", the URL, the project home page, the project title, the short title (like jboss versus JBoss Application Server).
 
Once I have this, a human reviews the results, and then we crawl the web for releases, and feed the releases to WGET. I string search the release names to get the unique elements from the archive file (like 3.1.a out of jboss-3.1.a.tar.gz).
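
The string search on release names can be approximated with a regular expression. The Python below is a sketch of one possible approach, not the tooling described in this post: the list of archive suffixes and the assumption that a release starts with a digit after the last matching hyphen are both guesses made for illustration.

```python
import re

# Assumed set of archive suffixes; extend as needed.
ARCHIVE_SUFFIX = re.compile(r"\.(tar\.gz|tar\.bz2|tgz|zip)$")
# Application name, a hyphen, then a release that starts with a digit.
NAME_RELEASE = re.compile(r"^(?P<app>.+?)-(?P<release>\d[\w.]*)$")


def split_archive_name(filename):
    """Return (application, release) or None if the name does not fit."""
    stem = ARCHIVE_SUFFIX.sub("", filename)
    match = NAME_RELEASE.match(stem)
    if match is None:
        return None
    return match.group("app"), match.group("release")


jboss = split_archive_name("jboss-3.1.a.tar.gz")
```

Names that do not follow the app-release pattern, or releases that themselves contain a hyphen, return None here and would still need the human review step.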
 
 
 
This can be done without a third party code scanner.
 
 
 
For the GPLv3 project at http://gpl3.palamida.com, all entries are hand added, but I use the Google API to populate all fields.
 
The project starts with crawlers that look for releases that include the GPLv3 license. Researchers take the release info, and download the open source project. They expand the project and look for the short name, the title, and the vendor. The web UI uses this data to narrow the possible results down, and the researcher confirms the selection of the Vendor, the App, and the release. These names are compared to the existing vendor/apps hosted by NVD.NIST.GOV, and CPE friendly names are suggested if they already exist for vendor and app.
 
A human makes the association, and then the record is stored in the database.
 
If you have the ability to use a string search, you can certainly associate a distro with a name. If you need to get more automatic, you can build string tools, or use commercial or OSS scanners to get you the first part.
 
It seems that our forms in the collection process combine some human smarts with web crawlers and Google API to attempt to resolve the most likely CPE name information quickly.
 
 
Ernie
 
 
 
