Ernest Park-2
|
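I have a dictionary of a few hundred thousand OSS project names with metadata and releases.
If someone writes the web service front end, I will publish all this to a database available to the service via the web. Basically, I have most of open source software in CPE format.
Any volunteers?
Ernie
On Wed, Jun 4, 2008 at 1:15 PM, Buttner, Drew <[hidden email]> wrote:
The "alias" feature is right along the lines of what we discussed at Developer Days. Nice!
Thanks
Drew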
||||||||||||||||
|
Vladimir Giszpenc
|
1. NVD is the maintainer of the official CPE dictionary, and it would make sense to add these as beta/unvetted/unofficial content to that dictionary.
2. Verifying that none are dupes or wrong in some other way is a large undertaking.
3. If I remember correctly, the CPE ids are in CPE 1.0 format, so they would need to be transformed to 2.1.
If Dave Waltermire and company need help setting up the web services, database and other plumbing, I may be able to contribute developer time to such a project.
Have a nice weekend!
Best regards,
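From: Ernest Park [mailto:[hidden email]]
I have a dictionary of a few hundred thousand OSS project names with metadata and releases.
If someone writes the web service front end, I will publish all this to a database available to the service via the web. Basically, I have most of open source software in CPE format.
Any volunteers?
On Wed, Jun 4, 2008 at 1:15 PM, Buttner, Drew <[hidden email]> wrote:
The "alias" feature is right along the lines of what we discussed at Developer Days. Nice!
Thanks
Drew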
|
Ernest Park-2
|
The problem lies in adding yet more unofficial content. My research team has a recognized expertise in open source software. Our contribution and experience warrant my work being treated as "authoritative" for open source software. Doing so allows the database to grow quickly, despite possible disagreements. Not doing so leaves things exactly as they are.
If we accepted 100,000 products and the 1,000,000 releases as "beta", who plans to review this, using what rules and metrics, and over what time?
Why would we accept as official contributions from Symantec but not from me? Product and release names for open source software are more critical to my business than to any of the authoritative sources you currently have. Open source software provides no centralized source for data. My team and I look at each release, each license file, and validate all related information, and then keep it on file. We make every effort to be thorough and complete, since our work represents the contributions of thousands who are not doing this for themselves.
By having a flexible "alias" concept, the community can accept the names, and modify the names through aliasing to support new standards without undermining the volume contributions.
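One way to picture the "alias" idea is a lookup table that maps contributed names to whatever name the community later settles on. The sketch below is only an illustration under assumptions: the table, the `resolve` helper, and every CPE name in it are hypothetical and not taken from the official dictionary.

```python
# Hypothetical alias table: contributed name -> currently preferred name.
# None of these entries come from the official CPE dictionary.
ALIASES = {
    "cpe:/a:apache:httpd:2.2.8": "cpe:/a:apache:http_server:2.2.8",
    "cpe:/a:apache_software_foundation:http_server:2.2.8": "cpe:/a:apache:http_server:2.2.8",
}


def resolve(name, aliases=ALIASES):
    """Follow alias links until reaching a name that has no further alias."""
    seen = set()
    while name in aliases:
        if name in seen:
            # Guard against two names being aliased to each other.
            raise ValueError(f"alias cycle detected at {name!r}")
        seen.add(name)
        name = aliases[name]
    return name
```

With a table like this, a volume contribution can be accepted under its original names and re-pointed later by adding alias rows, rather than by rewriting the contributed records.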
We must invite volume contribution from trusted sources. What are the criteria that we apply to trust contributions as official from certain authoritative sources?
An example of my research is . . .
Ernie
On Fri, Jun 6, 2008 at 11:31 AM, Vladimir Giszpenc <[hidden email]> wrote:
|
||||||||||||||||
|
Thomas R. Jones
|
In reply to this post
by Ernest Park-2
On Fri, 2008-06-06 at 16:26 -0400, Ernest Park wrote:
> Please keep in mind that I am deeply involved with managing and maintaining distinct records for millions of releases and billions of files and related components. I believe that what CPE represents is incredibly important.
>
> Comments below -
>
> On Fri, Jun 6, 2008 at 3:56 PM, Thomas R. Jones <[hidden email]> wrote:
> > Responses inline.
> >
> > Sent from my iPhone
> >
> > On Jun 6, 2008, at 2:21 PM, Ernest Park <[hidden email]> wrote:
> > > Hi Tom, notes inline.
> > >
> > > On Fri, Jun 6, 2008 at 2:51 PM, Thomas R. Jones <[hidden email]> wrote:
> > > > Hello Ernest,
> > > >
> > > > I have a few reservations. First of all, I am one of a small minority of open source researchers and contributors to CPE. So I would like to extend a welcome to you and your colleagues. Second, the vast amount of contributions is almost disconcerting. I am sure you and your colleagues have worked diligently to provide a much needed service to this community. And I for one thank you!
> > > >
> > > > However, what you propose is very difficult to envision on such a scale. No one in the community, that I know of, has had an opportunity to evaluate the contributions proposed. This should be a pre-requisite before anyone jumps on board. A view of the database structure is vital.
> > >
> > > Why do you need to view the data??
> >
> > The data is what is relevant. If I, and others that may possibly contribute, are not allowed to have access to said data then it is difficult to provide our support.
> >
> > As an analogy, would you buy a car if you not only could not see it but also not drive it?
> >
> > There are many many reasons that any one of us may want to obtain a subset of data.
>
> The analogy is incorrect. The CPE, despite the discussions here, is intended by its own definition to be an identifier, a URI-like string. In your analogy, this merely means that if I were buying a car, I would want a license plate that distinctly identified my car. Any additional data would be stored in my car, separate from that record with the unique identifier.
No. The analogy is correct. How may I know that my paint job is in fact a particular color if I may not see it? How do I know that my automobile is in fact made by a particular automaker if I cannot see the emblem? How may I be assured of the particular safety features I rely on if I cannot definitively say they are there for my use?
> The problem when we make CPE into a complex database is that we blur so much the lines of what it is and is not that we dissuade contributions and usage by the community.
This is a political view and/or opinion that does not need to be brought to light within the conversation. Lest you forget, I too am an open source contributor. I know all too well the complexities of contributing to a vendor majority sponsored standard. In fact I have done so through many standards within the W3C and IEEE communities. But we try as much as possible to reduce the amount of seclusion and segregation such as this. And I'll be honest in my opinion that Mitre and the individuals charged with this project have done an outstanding job doing so! ;)
> The CPE is a name that points to something, and with an inferred relational hierarchy in the name.
>
> If I want to deploy a database that supports CPE 1.x query, you do NOT need to qualify the database. If I offer to provide, or keep secret, anything beyond those elements which distinctly confirm a valid name and its association with a distinct technology component, that should be sufficient.
But you are asking the community to put forth faith in an infrastructure that we have not seen. How can we do that? Is it an IP issue that may be at hand? I'm sure that anyone here would put forth signatory recognition of an NDA if need be. Or do we just blindly go forth?
> When we try to make CPE something it is not, it will never be what it can be. If it is merely a naming identifier, it becomes a unification point for data from multiple providers. I could allow software companies to query my data. They may invite me to query theirs. The common unification is the name.
>
> Nothing should matter to CPE beyond a valid name and association to a distinct element, no more than the DMV cares about what fuel you run in the car.
>
> > > CPE is not a database or a schema. It is a string identifier format for distinct technology elements - nothing more. The idea at the end of the day is to provide a dictionary of names. The data underlying that is irrelevant, may be proprietary, and may have nothing to do with defining a name. I continually see the problem of CPE that we all fall into the mistake of making it something more than it is. CPE is a phone book - a set of distinct and human friendly identifiers for technology assets, nothing more.
> > >
> > > If I can provide you with Vendor, Application, Title, Release, URL, maybe an MD5, as part of a query, then it is the result set you should be looking at.
> >
> > This statement relates to the first question. The subset IS what is important. But how the data is obtained is also in question. I simply would like to see the SQL structure. What type of tables are utilized? Can they be easily restructured? Are we inhibited by the structure to not provide future advancements within the standard? May this data be replicated? Does the SQL structure take into consideration internationalization?
>
> From the CPE homepage (http://cpe.mitre.org)
>
> CPE™ is a structured naming scheme for information technology systems, platforms, and packages. Based upon the generic syntax for Uniform Resource Identifiers (URI), CPE includes a formal name format, a language for describing complex platforms, a method for checking names against a system, and a description format for binding text and tests to a name.
>
> There is no reference to SQL structure in the definition of CPE, nor a reference implementation. CPE is NOT a database or a data storage system of any kind. CPE does not denote a schema, but such information can be stored in a number of formats while still containing CPE compliant information.
True. But your data is housed in a database. If it were a simple text file, as is the current dictionary, then we as a community would want to see it and verify. We would ensure that the character encoding is sufficient for the community to process, and that there is no structural issue within the XML nodes that inhibits its utilization. It is no different. We, as a community (and I may be speaking for only myself here; I do not presume to speak for all the CPE community), would ensure that the data quality and availability is intact. As an information security organization, you surely understand the need for compliance with this aspect of the TRIAD.
> I am sure Symantec and McAfee store proprietary information along with having those components that support CPE in their data repositories, but they would no more open these databases to review than I will. If CPE is a name identifier constrained by elements, if I can provide the elements, perhaps:
>
> vendor, URL, application, app home page, release, release file name and URL, MD5 for release file,
>
> any string containing components from above is an identifier.
>
> > I could easily pose a few questions to you regarding the database and informational manipulation if you would prefer.
>
> What are the fields required in order to accept a third party contribution of a CPE name?
>
> > > Also, if I can provide something that nobody else has provided, why not use it until it is contested? If not, the database is perpetually bottlenecked by a subjective approval process that due to realistic limitations will never grow as fast as the growth in new open source projects over any measure of time.
> >
> > I applaud you and your colleagues' contribution. However a standard MUST undergo an official review and proposal process. Otherwise it is just another run-of-the-mill project to "put out the fires" of today's problems.
>
> If this process does not accept open participation from the community for submission volume in size with the growth and expansion of the market that we are describing, the process is inherently flawed, and the open source community and commercial vendors will be compelled to solve this issue.
Open participation is more than welcome. I would be the first to welcome such contribution. For it seems I have been a single open source voice in a predominantly vendor sponsored standard. However, this community has made great strides in the CPE standard. And we will continue to do so. There are a great many wonderful and committed people and organizations here within this standard. We welcome your organization's effort and contribution. I am very excited to see such progress. However, you are missing the point that I am trying to convey: WE as a community must develop and propagate the standard. This is a community oriented standard. I am simply asking that we slow down a bit and review the changes that you propose. Your proposal is of such vast magnitude that we cannot simply go forth head first and accept on a whim. Both open source and vendor entities must be presented the possibilities and subsequently review the proposal before any action can be taken.
> > > Naming open source is a problem that will require an open community approval process to function. The database needs to be able to grow as fast as possible, allowing voluminous contributions from certain trusted partners.
> >
> > The speed at which the CPE database "grows" is irrelevant. The quality of the data that it possesses is of paramount importance.
>
> It is a self limiting repository that will become less relevant over time if it cannot effectively describe the "market" of objects that it represents. If it only describes a quality subset, then it becomes a flawed and subjective list, and will force the commercial market to come up with something faster, better, and able to adapt to the growth in certain parts of the technology market and our need to universally describe these pieces.
>
> > As well, who determines what entails a "trusted partner"? How is this status obtained? Who authorizes or denies such claims?
>
> Why do we accept information from a commercial vendor as being authoritative, yet professional open source and commercial software researchers do not get offered this trust?
This is a flawed presumption. Please take the time to review the mailing list archives as I have previously noted. There has been great discussion over this exact topic.
> > > It is my business to research and catalog open source software. My work is cited by every major analyst every week. Not discounting the work of your team, but I implore you to "qualify" certain contributors as "authoritative" in order to allow growth.
> >
> > I would love to engage in further discussions of an "authoritative" entity. There have in fact been previous discussions regarding the authoritative subject for open source products. They should be available within the mailing list archives. However, maybe a review and/or re-discussion is due. I would happily contribute to such.
>
> Please feel free to reach out to me privately for further discussion.
>
> I can be reached at [hidden email] .
Thank you Ernest. ;) I will most definitely place you within my address book. I am sure we will converse much more in the near future. However, I feel it is in the best interest of the community at this time to ensure that all discussions related to the topic at hand are presented in a candid and open manner for all to review, reflect and hopefully comment on in the near future.
||||||||||||||||
|
Ken Lassesen-3
|
In reply to this post
by Ernest Park-2
I have the skills to do so --- and can host the webservice/website on a non-vendor related domain (Lassesen.com OR reddwarfdogs.com).
Some basic questions:
· What database are you using? If you can dump all of your data as XML, then it’s a meaningless question.
· For updates to the database, what is your plan?
  o Update it manually via an interface on the website?
  o Upload a delta as XML?
Ken Lassesen, Home/Office: 360-724-3190 Fax: 952-516-5077 IM: [hidden email] http://www.linkedin.com/in/lassesen
CONFIDENTIALITY NOTICE
The information contained in this electronic message may contain confidential and privileged information and is intended only for use by the individual(s) or entity(ies) to whom it was addressed. Any unauthorized review, use, disclosure, or distribution of this communication is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and permanently delete and destroy the original message.
From: Ernest Park [mailto:[hidden email]]
I have a dictionary of a few hundred thousand OSS project names with metadata and releases.
If someone writes the web service front end, I will publish all this to a database available to the service via the web. Basically, I have most of open source software in CPE format.
Any volunteers?
On Wed, Jun 4, 2008 at 1:15 PM, Buttner, Drew <[hidden email]> wrote:
The "alias" feature is right along the lines of what we discussed at Developer Days. Nice!
Thanks
Drew
||||||||||||||||
|
Ernest Park-2
|
To Ken - thanks! I will contact you for help to sort this out. I think that the CPE dictionary needs to be a real time dynamic framework that conforms to a URL resolution of a name query. Such resolution would allow information providers to "append" metadata to any record in a uniform format, and clarify that the primary reason for CPE to exist is to provide a distinct identifier for technology that can be further described; knowing the distinct name, such information can be shared and collaborated on.
----------------------------------------------------------------------
I think this is the right idea. I will discuss hosting with Drew. I certainly have the gear and domains to put this on a vendor neutral site, but unless this is hosted on the "sanctioned" site, it is just Ken and Ernie posting a list.
---------------------------------------------------------------------------------------
If Drew says that my site, or Ken's site, or a new, unnamed site, will be the source for EVERYTHING, then it will work. We cannot decouple the open source content as being distinct from that which has a vendor. In practice, most if not all of the commercial software has some element of open source anyway. If we get smart at naming stuff, do we want to actually name as follows -
commercial product ->contains->open source product
In practice, some commercial products are actually aliases for an amalgam of open source components.
The above is conceptual, but stresses the realistic importance of maintaining a singular, trusted and sanctioned source. Otherwise, my data is readily available and has been under a CC license for a year.
---------------------------------------------------------------------------------------
Regarding data, I store it across 4 MySQL databases in a few dozen tables. The CPE friendly output is the result of a ten way inner join. I could generate a join table that represents ONLY those fields that we need to construct a CPE name and validate it with an artifact, like a hash, a URL, a license file, etc. An XML schema works as well if we all agree on a simple schema for name synch, not data storage.
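As a rough illustration of "only those fields that we need to construct a CPE name and validate it with an artifact", the sketch below builds a CPE 2.x style URI from a vendor, product, and version, and checks a release file against a recorded MD5. The function names and the character-normalization rule are assumptions made for this example; they are not taken from the CPE specification or from the database described above.

```python
import hashlib
import re


def cpe_component(value):
    """Lowercase a field and collapse anything outside [a-z0-9._-] to '_'.

    This normalization rule is an assumption for illustration only.
    """
    return re.sub(r"[^a-z0-9._-]+", "_", value.strip().lower()).strip("_")


def build_cpe_name(vendor, product, version, part="a"):
    """Assemble a URI of the form cpe:/<part>:<vendor>:<product>:<version>."""
    fields = [cpe_component(f) for f in (vendor, product, version)]
    return "cpe:/" + ":".join([part] + fields)


def artifact_matches(data, expected_md5):
    """Return True if the MD5 of the release file bytes equals the stored hash."""
    return hashlib.md5(data).hexdigest() == expected_md5.lower()
```

A join table row holding vendor, product, release, and an MD5 would then be enough for someone else to rebuild the name and confirm it against the release file, without seeing the rest of the schema.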
Granted, once you have the name, you can query my database across about 4 billion records to investigate trending, software usage, patterns, etc. By having a standard name, I can expose my web service to certain queries without just synching my DB.
From what I have seen, I may currently have the single largest CPE compliant implementation. It needs endorsement from the community of users, automatic integration into the big database, and a facility with which we can query the data.
The data is currently maintained as updates to the database. I could either push XML updates, or synch tables, or push SQL changes.
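For the "push XML updates" option, a delta could be computed by comparing two snapshots of the name table and serializing only the differences. This is a minimal sketch: the element names (`cpe-delta`, `add`, `remove`) and both helper functions are hypothetical, since no schema has been agreed on in this thread.

```python
import xml.etree.ElementTree as ET


def diff_snapshots(old, new):
    """Compare two {cpe_name: title} dicts; return (added_or_changed, removed)."""
    added = {name: title for name, title in new.items() if old.get(name) != title}
    removed = set(old) - set(new)
    return added, removed


def build_delta(added, removed):
    """Serialize the differences as a small XML document (hypothetical schema)."""
    root = ET.Element("cpe-delta")
    for name, title in sorted(added.items()):
        item = ET.SubElement(root, "add", {"name": name})
        item.text = title
    for name in sorted(removed):
        ET.SubElement(root, "remove", {"name": name})
    return ET.tostring(root, encoding="unicode")
```

Pushing only the delta keeps each update proportional to what changed rather than to the full set of records.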
I am certain that the volume of records that I have may be fraught with inconsistencies and errors. However, the data has been copiously reviewed by a staff of 50, and is at least of quality equivalent to what we have. If we agree on a way to accept this data, perhaps we can agree on a way of accepting a "non-static" dictionary. If the dictionary were a dynamic point in time representation of our accumulated data, stored in a database or series of databases, queried by approved members through a secured web service, we can all collaboratively grow this data with fewer bottlenecks.
---------------------------------------------------------------------------------------
Trend Analysis -
10 years ago, open source reported vulnerabilities represented less than 30% of all issues
Currently, over 55%.
The linear trend will have 80% of all vulnerabilities reported against open source within 6 years.
There are over 500,000 open source software projects worldwide. There are an average of 8 recognized releases per project, so with potentially 4,000,000 releases to be named, this is a large task.
A number of analysts quoting large corporate buyers have cited a trend that will be reflected within 5 years. What was confirmed is the reality that 80% of software in use by government and enterprise will be open source based, and 50 - 80% of that will be delivered as a web service - software kept on a remote server, and only the service experienced as the result of an interaction with a web browser.
In summary, this tells us that the importance that we currently put on vendor supported names will have much less relevance in the real use of technology assets over the next half decade. If we don't embrace an understanding of the real impact of open source within our computing world, then CPE will continue to be primarily a naming system for commercial apps and those things that NVD finds.
On Fri, Jun 6, 2008 at 1:48 PM, Ken Lassesen <[hidden email]> wrote:
|
||||||||||||||||
|
Thomas R. Jones
|
Hello Ernest,
I have a few reservations. First of all, I am one of a small minority of open source researchers and contributors to CPE. So I would like to extend a welcome to you and your colleagues. Second, the vast amount of contributions is almost disconcerting. I am sure you and your colleagues have worked diligently to provide a much needed service to this community. And I for one thank you!
However, what you propose is very difficult to envision on such a scale. No one in the community, that I know of, has had an opportunity to evaluate the contributions proposed. This should be a pre-requisite before anyone jumps on board. A view of the database structure is vital. We, as you surely understand, are all putting valuable time into the standard. And to facilitate further development within your proposal, we must be able to ensure that the "project" is not flawed within design or structure constraints.
Furthermore, there should be a community discussion of the stewardship of such a project. The notion that "EVERYTHING" be authoritative through this project is ambitious but wholly flawed. There should be an overwhelming discussion of such aspects and subsequent requests before such proposals may be presented.
I look forward to seeing and hearing of the magnitude of contributions that you and your colleagues may provide. Thank you once again.
Sent from my iPhone
|
||||||||||||||||
|
Andrew Buttner
|
In reply to this post
by Ernest Park-2
All,
I think this work is a huge help for CPE and will get us much further down the road than where we are today. But I'd like to scale back a little of what I think I am reading.
CPE as a project is focused on the naming specification and hosting the Official CPE Dictionary. This dictionary should be focused on providing a list of all known CPE names, similar in scope to the CVE list. Having these names available to the community will enable external applications to stand up and support added metadata.
What I think I am reading fits into two very different jobs going forward. First is the submission to the Official CPE Dictionary of the CPE Names for the open source platforms you have knowledge about. Second is work on an application outside of CPE that provides a database (keyed off of CPE Name) of appended metadata.
Is this understanding correct? If so, I would really like CPE as a project to focus on the first step. Agree?
Thanks!
Drew
>-----Original Message-----
>From: Ernest Park [mailto:[hidden email]]
>Sent: Friday, June 06, 2008 2:20 PM
>To: cpe-discussion-list CPE Community Forum
>Subject: Re: [CPE-DISCUSSION-LIST] OSS CPEs
>
>To Ken - thanks! I will contact you for help to sort this out. I think that the CPE dictionary needs to be a real time dynamic framework that conforms to a URL resolution of a name query. Such resolution would allow information providers to "append" metadata to any record in a uniform format, and clarify that the primary reason of CPE to exist is to provide a distinct identifier to technology that can be further described, and knowing the distinct name, such information can be shared and collaborated with.
>
>----------------------------------------------------------------------
>
>I think this is the right idea. I will discuss hosting with Drew. I certainly have the gear and domains to put this on a vendor neutral site, but unless this is hosted on the "sanctioned" site, it is just Ken and Ernie posting a list.
>
>----------------------------------------------------------------------
>
>If Drew says that my site, or Ken's site, or a new, unnamed site, will be the source for EVERYTHING, then it will work. We cannot decouple the open source content as being distinct from that which has a vendor. In practice, most if not all of the commercial software has some element of open source anyway. If we get smart at naming stuff, do we want to actually name as follows -
>
>commercial product ->contains->open source product
>
>In practice, some commercial products are actually aliases for an amalgam of open source components.
>
>The above is conceptual, but stresses the realistic importance of maintaining a singular, trusted and sanctioned source. Otherwise, my data is readily available and has been under a CC license for a year.
>
>----------------------------------------------------------------------
>
>Regarding data, I store it across 4 MySQL databases in a few dozen tables. The CPE friendly output is the result of a ten way inner join. I could generate a join table that represents ONLY those fields that we need to construct a CPE name and validate it with an artifact, like a hash, a URL, a license file, etc. An XML schema works as well if we all agree on a simple schema for name synch, not data storage.
>
>Granted, once you have the name, you can query my database across about 4 billion records to investigate trending, software usage, patterns, etc. By having a standard name, I can expose my web service to certain queries without just synching my DB.
>
>From what I have seen, I may currently have the single largest CPE compliant implementation. It needs endorsement from the community of users, automatic integration into the big database, and a facility with which we can query the data.
>
>The data is currently maintained as updates to the database. I could either push XML updates, or synch tables, or push SQL changes.
>
>I am certain that the volume of records that I have may be fraught with inconsistencies and errors. However, the data has been copiously reviewed by a staff of 50, and is at least of quality equivalent to what we have. If we agree on a way to accept this data, perhaps we can agree on a way of accepting a "non-static" dictionary. If the dictionary were a dynamic point in time representation of our accumulated data, stored in a database or series of databases, queried by approved members through a secured web service, we can all collaboratively grow this data with fewer bottlenecks.
>
>----------------------------------------------------------------------
>
>Trend Analysis -
>
>10 years ago, open source reported vulnerabilities represented less than 30% of all issues
>Currently, over 55%.
>
>The linear trend will have 80% of all vulnerabilities reported against open source within 6 years.
>
>http://gpl3.blogspot.com/2008/03/gpl-project-watch-list-for-week-of-0328.html
>
>There are over 500,000 open source software projects worldwide. There are an average of 8 recognized releases per project, so with potentially 4,000,000 releases to be named, this is a large task.
>
>A number of analysts quoting large corporate buyers have cited a trend that will be reflected within 5 years. What was confirmed is the reality that 80% of software in use by government and enterprise will be open source based, and 50 - 80% of that will be delivered as a web service - software kept on a remote server, and only the service experienced as the result of an interaction with a web browser.
>
>In summary, this tells us that the importance that we currently put on vendor supported names will have much less relevance in the real use of technology assets over the next half decade. If we don't embrace an understanding of the real impact of open source within our computing world, then CPE will continue to be primarily a naming system for commercial apps and those things that NVD finds.
>
>On Fri, Jun 6, 2008 at 1:48 PM, Ken Lassesen <[hidden email]> wrote:
>
>    I have the skills to do so --- and can host the webservice/website on a non-vendor related domain (Lassesen.com OR reddwarfdogs.com <http://reddwarfdogs.com/> )
>
>    Some basic questions:
>
>    * What database are you using? If you can dump all of your data as XML, then it's a meaningless question
>    * For updates to the database what is your plan?
>        o Update it manually via an interface on the website?
>        o Upload a delta as Xml?
>
>    Ken Lassesen,
>    Home/Office: 360-724-3190 Fax: 952-516-5077
>    Cell: 360-509-2402 Skype: Ken.Lassesen
>    IM: [hidden email] http://www.linkedin.com/in/lassesen
>
>    CONFIDENTIALITY NOTICE
>    The information contained in this electronic message may contain confidential and privileged information and is intended only for use by the individual(s) or entity(ies) to whom it was addressed. Any unauthorized review, use, disclosure, or distribution of this communication is strictly prohibited. If you are not the intended recipient, please contact the sender by reply email and permanently delete and destroy the original message.
>
>    From: Ernest Park [mailto:[hidden email]]
>    Sent: Thursday, June 05, 2008 1:05 PM
>    To: [hidden email]
>    Subject: [CPE-DISCUSSION-LIST] OSS CPEs
>
>    I have a dictionary of a few hundred thousand OSS project names with metadata and releases.
>
>    If someone writes the web service front end, I will publish all this to a database available to the service via the web. Basically, I have most of open source software in CPE format.
>
>    Any volunteers?
>
>    Ernie
>
>    On Wed, Jun 4, 2008 at 1:15 PM, Buttner, Drew <[hidden email]> wrote:
>
>    I like your approach here and this is a perfect use of CPE. You have created a schema for your database that uses the CPE Name to id platform information. This will theoretically allow others to interact with your database using a CPE Name, or will allow you to interact with other data sources via CPE Name.
>
>    The "alias" feature is right along the lines of what we discussed at Developer Days. Nice!
>
>    Thanks
>    Drew
||||||||||||||||
|
Ernest Park-2
|
In reply to this post
by Thomas R. Jones
Hi Tom, notes inline.
On Fri, Jun 6, 2008 at 2:51 PM, Thomas R. Jones <[hidden email]> wrote:
Why do you need to view the data?? CPE is not a database or a schema. It is a string identifier format for distinct technology elements - nothing more. The idea at the end of the day is to provide a dictionary of names. The data underlying that is irrelevant, may be proprietary, and may have nothing to do with defining a name. I continually see the problem of CPE that we all fall into the mistake of making it something more than it is. CPE is a phone book - a set of distinct and human friendly identifiers for technology assets, nothing more.
If I can provide you with Vendor, Application, Title, Release, URL, maybe an MD5, as part of a query, then it is the result set you should be looking at.
Also, if I can provide something that nobody else has provided, why not use it until it is contested? If not, the database is perpetually bottlenecked by a subjective approval process that due to realistic limitations will never grow as fast as the growth in new open source projects over any measure of time.
Naming open source is a problem that will require an open community approval process to function. The database needs to be able to grow as fast as possible, allowing voluminous contributions from certain trusted partners. It is my business to research and catalog open source software. My work is cited by every major analyst every week. Not discounting the work of your team, but I implore you to "qualify" certain contributors as "authoritative" in order to allow growth. Further, you should specify the minimum acceptable data required to satisfy a valid entry - like a naming API.
My experience is that the community as a whole does a good job of cleaning and managing a system. By publishing it all, but in a community-maintainable "wiki" format for name associations, the community can resolve the dictionary dynamically, without impeding its growth and immediate value.
|
||||||||||||||||
|
Ernest Park-2
|
In reply to this post
by Andrew Buttner
Yes - 1. The contribution from acknowledged "authoritative" sources for 2. Value add extended metadata is nice, but clearly beyond the scope of Even if some contributed content is not perfect, it is better than On Fri, Jun 6, 2008 at 3:18 PM, Buttner, Drew <[hidden email]> wrote:
All, |
||||||||||||||||
|
Thomas R. Jones
|
In reply to this post
by Ernest Park-2
Responses inline. Sent from my iPhone
As an analogy, would you buy a car if you not only could not see it but also not drive it? There are many many reasons that any one of us may want to obtain a subset of data.
I could easily pose a few questions to you regarding the database and informational manipulation if you would prefer.
As well, who determines what entails a "trusted partner"? How is this status obtained? Who authorizes or denies such claims?
|
||||||||||||||||
|
Ernest Park-2
|
Please keep in mind that I am deeply involved with managing and maintaining distinct records for millions of releases and billions of files and related components. I believe that what CPE represents is incredibly important.
Comments below -
On Fri, Jun 6, 2008 at 3:56 PM, Thomas R. Jones <[hidden email]> wrote:
The analogy is incorrect. The CPE, despite the discussions here, is intended by its own definition to be an identifier, a URI-like string. In your analogy, this merely means that if I were buying a car, I would want a license plate that distinctly identified my car. Any additional data would be stored in my car, separate from that record with the unique identifier.
The problem when we make CPE into a complex database is that we blur the lines of what it is and is not so much that we dissuade contributions and usage by the community.
The CPE is a name that points to something, with an inferred relational hierarchy in the name.
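To make the "name with an inferred hierarchy" point concrete, here is a minimal sketch in Python. It assumes the CPE 2.x URI layout (cpe:/part:vendor:product:version:update:edition:language); the function names and the sample names are hypothetical, and the matching rule shown is a simplified component-by-component comparison, not a complete implementation of the specification.

```python
from urllib.parse import unquote

# Assumed component order for a CPE 2.x URI-style name.
CPE_FIELDS = ("part", "vendor", "product", "version",
              "update", "edition", "language")


def parse_cpe(name):
    """Split a CPE 2.x URI-style name into its ordered components."""
    prefix = "cpe:/"
    if not name.startswith(prefix):
        raise ValueError("not a CPE 2.x name: %r" % name)
    values = name[len(prefix):].split(":")
    if len(values) > len(CPE_FIELDS):
        raise ValueError("too many components in %r" % name)
    # Trailing components may be omitted; an empty string means "unspecified".
    values += [""] * (len(CPE_FIELDS) - len(values))
    return dict(zip(CPE_FIELDS, (unquote(v) for v in values)))


def covers(general, specific):
    """True if every specified component of `general` equals that of `specific`."""
    g = parse_cpe(general)
    s = parse_cpe(specific)
    return all(g[f] == "" or g[f] == s[f] for f in CPE_FIELDS)


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(parse_cpe("cpe:/a:apache:http_server:2.2.8"))
    print(covers("cpe:/a:apache:http_server", "cpe:/a:apache:http_server:2.2.8"))
```

Nothing here touches a database: the hierarchy (vendor, then product, then release) is read from the string alone.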
If I want to deploy a database that supports CPE 1.x query, you do NOT need to qualify the database. If I offer to provide, or keep secret, anything beyond those elements which distinctly confirm a valid name and its association with a distinct technology component, that should be sufficient.
When we try to make CPE something it is not, it will never be what it can be. If it is merely a naming identifier, it becomes a unification point for data from multiple providers. I could allow software companies to query my data. They may invite me to query theirs. The common unification is the name.
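As a rough illustration of the name as a unification point, the sketch below joins two independent data sets on nothing but the CPE name. The provider contents, field names, and the `join_by_cpe` helper are all hypothetical; neither side needs to expose its schema to the other.

```python
def join_by_cpe(left, right):
    """Pair up records from two providers that share the same CPE name.

    `left` and `right` are dicts keyed by CPE name; the values are whatever
    each provider chooses to publish about that name.
    """
    merged = {}
    for name in left.keys() & right.keys():
        merged[name] = {"left": left[name], "right": right[name]}
    return merged


if __name__ == "__main__":
    # Hypothetical data from two unrelated providers.
    oss_catalog = {
        "cpe:/a:apache:http_server:2.2.8": {"license": "Apache-2.0"},
        "cpe:/a:gnu:wget:1.11": {"license": "GPL-3.0"},
    }
    vendor_feed = {
        "cpe:/a:apache:http_server:2.2.8": {"advisories": 3},
    }
    print(join_by_cpe(oss_catalog, vendor_feed))
```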
Nothing should matter to CPE beyond a valid name and association to a distinct element, no more than the DMV cares about what fuel you run in the car.
From the CPE homepage (http://cpe.mitre.org)
There is no reference to SQL structure in the definition of CPE, nor a reference implementation. CPE is NOT a database or a data storage system of any kind. CPE does not denote a schema, but such information can be stored in a number of formats while still containing CPE-compliant information.
I am sure Symantec and McAfee store proprietary information along with having those components that support CPE in their data repositories, but they would no more open these databases to review than I will. If CPE is a name identifier constrained by elements, if I can provide the elements, perhaps:
vendor, URL, application, app home page, release, release file name and URL, MD5 for release file,
any string containing components from above is an identifier.
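A minimal sketch of that idea: build the name from a few of the elements and keep everything else (URLs, MD5) outside the identifier. The normalization rules here (lowercase, whitespace to underscores, percent-encoding of other characters) are an assumption for illustration, and `build_cpe`, `normalize`, and the sample record are hypothetical.

```python
import re
from urllib.parse import quote


def normalize(component):
    """Lowercase a component, replace whitespace with '_', percent-encode the rest."""
    text = component.strip().lower()
    text = re.sub(r"\s+", "_", text)
    # ':' is the component separator, so it must not survive unescaped.
    return quote(text, safe="._-~")


def build_cpe(vendor, application, release, part="a"):
    """Assemble a CPE 2.x URI-style name from vendor, application, and release."""
    pieces = [part, normalize(vendor), normalize(application), normalize(release)]
    return "cpe:/" + ":".join(pieces)


if __name__ == "__main__":
    # Hypothetical record; only three fields feed the name.
    record = {
        "vendor": "Apache Software Foundation",
        "application": "HTTP Server",
        "release": "2.2.8",
        "url": "http://example.org/httpd-2.2.8.tar.gz",
    }
    print(build_cpe(record["vendor"], record["application"], record["release"]))
```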
What are the fields required in order to accept a third party contribution of a CPE name?
If this process does not accept open participation from the community at a submission volume that keeps pace with the growth and expansion of the market we are describing, the process is inherently flawed, and the open source community and commercial vendors will be compelled to solve this issue.
It is a self-limiting repository that will become less relevant over time if it cannot effectively describe the "market" of objects that it represents. If it only describes a quality subset, then it becomes a flawed and subjective list, and will force the commercial market to come up with something faster, better, and able to adapt to the growth in certain parts of the technology market and our need to universally describe these pieces.
Why do we accept information from a commercial vendor as being authoritative, yet professional open source and commercial software researchers do not get offered this trust?
Please feel free to reach out to me privately for further discussion.
I can be reached at [hidden email] .
|
||||||||||||||||
|
Ernest Park-2
|
In reply to this post
by Thomas R. Jones
Hi Tom,
Exactly what data elements are required in order to satisfactorily deliver a contribution for a single CPE name?
What else?
I will send you fully qualified name strings in a data format specified, or I will populate a SQL database if you can provide a standardized table format.
Keep in mind, part of the data is proprietary. If I extract the data and decouple it from proprietary information in a way that satisfies CPE contribution requirements, I can do that.
Tom, what is missing is that we want to look under the hood without defining what is being looked for. Instead, give me a clear and definite format - comma separated, SQL, etc, and I will send a sample of compliant data for review.
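For illustration only, here is what a "clear and definite format" could look like for the comma-separated case, with a small checker that a contributor could run before sending a sample. The column set is an assumption drawn from the elements listed earlier in this thread, not an agreed CPE submission format, and `validate_rows` is a hypothetical helper.

```python
import csv
import io
import re

# Hypothetical column set; no such submission format has been agreed.
REQUIRED = ["cpe_name", "vendor", "application", "release", "url", "md5"]
MD5_RE = re.compile(r"^[0-9a-f]{32}$")


def validate_rows(text):
    """Return a list of (line_number, problem) pairs for a CSV submission."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    problems = []
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        if not row["cpe_name"].startswith("cpe:/"):
            problems.append((line_no, "cpe_name must start with cpe:/"))
        if row["md5"] and not MD5_RE.match(row["md5"].lower()):
            problems.append((line_no, "md5 must be 32 hex digits"))
    return problems


if __name__ == "__main__":
    sample = (
        "cpe_name,vendor,application,release,url,md5\n"
        "cpe:/a:example:demo:1.0,Example,Demo,1.0,"
        "http://example.org/demo-1.0.tar.gz,d41d8cd98f00b204e9800998ecf8427e\n"
    )
    print(validate_rows(sample))
```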
Ernie
On Fri, Jun 6, 2008 at 12:42 PM, Thomas R. Jones <[hidden email]> wrote:
|
||||||||||||||||