ppy/chat-transcript_unedited_20120308a.txt
------------- Chat transcript from room: summit_20120308 2012-03-08 GMT-08:00 -------------
[09:12] PeterYim: Welcome to the
= OntologySummit2012: Session-09, Thursday 2012-03-08 =
Summit Theme: OntologySummit2012: "Ontology for Big Systems"
Track (4) Title: Large-Scale Domain Applications
Session Topic: Large-scale domain applications – Biomedical, earth & environmental science & engineering
Session Chairs: Dr. TrishWhetzel (NCBO; Stanford) and Dr. SteveRay (CMU)
Panelists:
* Mr. DavidPrice (TopQuadrant) - "Experiences from a Large Scale Ontology-Based Application Development for Oil Platforms"
* Dr. MichaelKellen (Sage Bionetworks) - "Collaborative Clinical Genomics Data Analysis with Sage Bionetworks Synapse"
* Dr. DamianGessler (iPlant Collaborative) & Dr. BlazejBulka (Clark & Parsia) - "The iPlant Collaborative Semantic Web Platform: Using OWL and SSWAP (Simple Semantic Web Architecture and Protocol) for On-Demand Semantic Pipelines"
* Dr. IlyaZaslavsky (San Diego Supercomputing Center) - "Managing observation semantics in CUAHSI Hydrologic Information System"
* Dr. LinePouchard (Oak Ridge National Laboratory) - "Linked Science as a producer and consumer of big data in the Earth Sciences"
Session page: http://ontolog.cim3.net/cgi-bin/wiki.pl?ConferenceCall_2012_03_08
Mute control: *7 to un-mute ... *6 to mute (please make sure your own phone is not muted as well)
Can't find Skype Dial pad? ... it's under the "Call" dropdown menu as "Show Dial pad"
.
== Proceedings: ==
.
[09:25] anonymous morphed into DamianGessler
[09:27] anonymous morphed into ChristopherSpottiswoode
[09:29] anonymous1 morphed into DavidFlater
[09:29] anonymous morphed into Jim Schoening
[09:30] anonymous morphed into DavidPrice
[09:31] anonymous morphed into Michael Kellen
[09:32] anonymous morphed into JimRhyne
[09:32] TerryLongstreth: I hear you.
My mic isn't working
[09:32] anonymous morphed into ScottHills
[09:33] anonymous morphed into Byron Davies
[09:35] ScottHills1 morphed into ScottHills
[09:36] Ilya Zaslavsky: my mic is not muted - may be on your side?
[09:36] anonymous morphed into PavithraKenjige
[09:36] LinePouchard: testing
[09:41] anonymous morphed into ElisaKendall
[09:44] MikeBennett: Does the reported data include SCADA data from on-platform systems such as Fire and Gas, ESD and so on? Just curious.
[09:46] PeterYim: @9:45am PST - DavidPrice is on slide#8 now ...
[09:47] PeterYim: @9:46am PST - DavidPrice is on slide#10 now ...
[09:47] Ernani Santos morphed into ErnaniSantos
[09:48] SteveRay: Slide 11
[09:48] anonymous morphed into thomasGetgood
[09:48] thomasGetgood morphed into ThomasGetgood
[09:49] DougFoxvog: What is "warm fallover"?
[09:49] SteveRay: @PeterYim: We are on slide 12
[09:50] SteveRay: Slide 14
[09:50] Harold Boley: Re Slide 10: Would the 300 million triples be structured/modularized in some way, e.g. as named graphs?
[09:53] MichaelGruninger: @David -- you say that it is hard to test an ontology. Can you share any of your ontologies which we can test?
[09:55] DavidPrice: The data is about Drilling and Production. Some of it is measurement, but not really SCADA.
[09:56] DavidPrice: The triples are managed in graphs based on Licenses for Fields in the sea. This allows us to control the set of data over which aggregations and queries are allowed and also allows us to use these graphs as the basis for access control and security.
[09:56] MikeBennett: @David thanks. Just seeing an opportunity there.
[09:56] anonymous morphed into DWiz
[09:56] AmandaVizedom: @David - to what extent do you think the "vague/ambiguous" character of the ontologies or their documentation (your slide 17) is a result of the application type being comparatively shallow (that is, not too much axiomatization or reasoning)?
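The graph-per-license arrangement DavidPrice describes at [09:56] can be illustrated with a minimal sketch in plain Python rather than a real triple store. Everything named here is hypothetical (the license identifiers, the well triples, and the `query` and `total` helpers); it only shows how restricting a query to a set of permitted graphs can scope an aggregation and act as an access-control boundary at the same time.

```python
from collections import defaultdict

# Hypothetical quad store: graph name (one per field license) -> set of triples.
store = defaultdict(set)
store["license:PL001"].add(("well:A1", "producedOil", 120))
store["license:PL001"].add(("well:A2", "producedOil", 80))
store["license:PL002"].add(("well:B1", "producedOil", 200))


def query(store, allowed_graphs, predicate):
    """Return sorted (subject, object) pairs for `predicate`,
    reading only the graphs the caller is permitted to see."""
    return sorted(
        (s, o)
        for g in allowed_graphs
        for (s, p, o) in store.get(g, ())
        if p == predicate
    )


def total(store, allowed_graphs, predicate):
    """Aggregate numeric objects over the permitted graphs only."""
    return sum(o for _, o in query(store, allowed_graphs, predicate))


# A user licensed for PL001 never touches the PL002 graph.
user_graphs = {"license:PL001"}
print(total(store, user_graphs, "producedOil"))
```

In a SPARQL setting the same restriction would normally be expressed through the query's dataset or a `GRAPH` pattern; the dictionary above is only a stand-in for that idea.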
[09:57] DavidPrice: I cannot yet share the ontologies, but they will be made public eventually ... probably 3Q2012. The project is not yet complete, the Drilling stuff is going into production use this month. The Production use will follow between now and June/July.
[09:57] PeterYim: == MikeKellen presenting ...
[09:58] DWiz: David: Do you have a mandatory XSD? Otherwise, how do you generate a true ontology from XML?
[09:59] DavidPrice: I attribute vague-ness entirely to projects running out of money - in fact I have heard that from the people who created some of the background ontologies/reference data we are re-using.
[09:59] AmandaVizedom: @David - Have you tried developing test beds and test apps with which to evaluate the ontologies? Given the well-developed application context, this seems like an approach you could use to ontology testing & proofing.
[10:00] AmandaVizedom: @David - Ah, yes, I see what you mean. Not unlike undocumented code from abandoned software projects, then.
[10:00] DavidPrice: Our 'XSD Proxy Ontology' capability does not try to create a 'true ontology' - as the name suggests it's a proxy for the XSD that allows us to import an XML data file into the workspace and do SPARQL over it directly through the proxy ontology-based triples.
[10:02] DavidPrice: We are just now getting into the detailed use of proper software testing tools to try to do a better job wrt testing our ontology. We are working on large, complete test cases, test scenarios, automated tests using REST-like services, etc. to make testing the ontology fit into the more traditional testing apparatus used by our software team.
[10:02] DavidPrice: FWIW we use Github for the source code/ontology management and SpiraTest for the testing tool.
[10:03] DavidPrice: TopBraid Composer is an eclipse-based tool and that's what we use to develop ontology and SPIN/SPARQL/SWP.
[10:03] DavidPrice: I have to leave for a while but will respond to any other questions in 30 mins or so. Thanks!
[10:04] SteveRay: Thanks David. Fascinating talk.
[10:04] AmandaVizedom: @David - I can't remember who it was, now, but one presenter earlier in the summit discussed an approach in which they used "real" (sound, computational) ontologies but also intermediate artifacts that are ontological in format (OWL) but are not used (or usable) as ontologies; rather, they are fairly direct models of the data source's data model. This is similar to what you describe, yes?
[10:07] anonymous morphed into GiulianoLancioni
[10:08] SimonSpero: @Michael Kellen: SKOS is for controlled vocabularies; SKOS concepts are "Subjects", not the things that subjects are about
[10:08] SimonSpero: @Michael: is that the intended semantics?
[10:09] Michael Kellen: Yes, we aren't trying to create a model of the relationships among domain objects that we can reason about
[10:10] Michael Kellen: We are simply trying to consistently structure information to help scientists pull together appropriate data so that they can reason about it
[10:11] SimonSpero: @Michael: as long as it's just for guiding people to data sets, that's a safe use
[10:12] Michael Kellen: There are other projects in life sciences trying to use the richer semantics to actually model the domain
[10:12] PeterYim: == DamianGessler presenting ...
[10:12] Michael Kellen: The problem they hit is that there are so many unknowns in our domain that this is hard to do
[10:15] SimonSpero: @Michael: right - it's just important to keep the distinctions clear so that a KOS isn't used directly as an ontology
[10:19] MikeBennett: @Simon @Michael we have a labeling problem: if we get into the habit of referring to everything that is in triple-store formats as "An ontology" then we need a new word for ontologies. Syntax is not semantics...
[10:23] BobbinTeegarden: @Damian where are the transformation decisions made between services in the pipeline?
[10:25] PeterYim: == IlyaZaslavsky presenting ...
[10:26] DougFoxvog: How do you deal when a large number of possible services are available at one point? E.g., there may be hundreds of services available for converting images from Format A to Format B.
[10:27] anonymous morphed into CarlosRueda
[10:28] PeterYim: @DWiz ... may we have your real name, please?
[10:29] DamianGessler: @BobbinTeegarden third-parties hosting SSWAP semantic web services run a servlet that we provide from our Software Development Kit. This servlet handles the semantics and ensures that both input and output follow the protocol. So at the pipeline end, we can look at the outputs and required inputs, and orchestrate the interaction. Third-party data need not pass through us: it goes directly from the upstream service to the downstream service.
[10:31] LinePouchard: @Peter: I sent you a new set of slides that contain page numbers. Would you have time to upload them?
[10:31] BobbinTeegarden: @Damian Thank you, grand.
[10:31] DamianGessler: @DougFoxvog The key is in service choice prioritization--just like Google prioritizes web pages on its search results page. But here, we are not nearly as sophisticated as Google and currently use a very simple algorithm. We'll put focus here down the road a little.
[10:32] PeterYim: @10:32 PST - slide#13 now ...
[10:32] AmandaVizedom: @Damian - There are many similarities between the iPlant approach you describe and the Semantic SOA - service discovery approach being developed by USAF. I think that the approach used in the DoD EIW is also strongly similar (perhaps DWiz - DennisWisnosky - will comment). In each case, ontologies are primarily being used to provide semantic description, model, or wrapper for (mostly natively non-semantic) data services, and ontological reasoning and search technologies are used to enable service discovery given user needs. Your statement about ontology alignment, however, stands out.
I understand service matching operationally and ephemerally based on reasoning over lightly-aligned ontologies. But you seem to be saying something else, that the alignment of the locally-developed ontologies in which the services are described is not manual, not static, and not axiomatic. Can you say something more about how you align, or connect, or reason across such ontologies without any prior / stable alignment points?
[10:33] PeterYim: @10:33 PST - slide#14 now ...
[10:35] DavidPrice: @Amanda - wrt direct models of the data source ... yes, we call that a proxy ontology.
[10:36] DamianGessler: @AmandaVizedom The key is that we are not aligning ontologies on *data* per se; we let services make the mapping statements (e.g., the (possibly complex) data they take in and the (possibly complex) data they give back). So we "simply" need to determine and operate on subsumption questions: can this service operate on my data and return what I want? This is essentially a dynamic, operational alignment question.
[10:37] SimonSpero: @Ilya: if the hierarchies are genuine hierarchies - that is, subordinate terms always entail the superordinate term, then you have traditional Knowledge Organization System semantics
[10:37] PeterYim: @Line - ack .... will try now
[10:38] SteveRay: Slide 20
[10:38] DavidPrice: @Amanda For ontologies of a domain, we usually follow Leo Obrst's use of the term 'Strong ontologies' meaning they are about the domain of interest.
[10:38] AmandaVizedom: @David - Thanks. I think a similar approach is used for the DoD EIW. Independently, we discussed using something like "proxy ontologies" on the USAF project. Though non-technical factors/authorities mooted that discussion, I thought (think) that it was promising. I sense a pattern. ;-)
[10:39] ElisaKendall: @Ilya, have you used any vocabulary such as ISO 1087 to define relationships such as synonymy, polysemy, etc.? Just wondering ...
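The subsumption question DamianGessler poses at [10:36] ("can this service operate on my data and return what I want?") can be sketched in a few lines of plain Python. This is not the SSWAP API and does no OWL reasoning: the class names, the `PARENT` table, the service list, and the helper functions are all assumptions made for illustration, and the subsumption check is a simple walk up a single-parent hierarchy.

```python
# Hypothetical single-parent class hierarchy: child -> parent.
PARENT = {
    "FASTA": "SequenceFile",
    "SequenceFile": "Data",
    "NewickTree": "Tree",
    "Tree": "Data",
    "Number": "Data",
    "Image": "Data",
}

# Hypothetical services: name -> (class accepted as input, class returned).
SERVICES = {
    "buildTree": ("SequenceFile", "NewickTree"),
    "countRecords": ("SequenceFile", "Number"),
    "renderTree": ("Tree", "Image"),
}


def is_subsumed_by(cls, ancestor):
    """True if `cls` equals `ancestor` or sits below it in PARENT."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False


def usable_services(have, want):
    """Services whose input class subsumes the data we have and whose
    output class is subsumed by what we want back."""
    return sorted(
        name
        for name, (accepts, returns) in SERVICES.items()
        if is_subsumed_by(have, accepts) and is_subsumed_by(returns, want)
    )


print(usable_services("FASTA", "Tree"))
```

Each service states only its own input and output classes, so nothing aligns the data ontologies up front; the match is computed when a request arrives, which is the "dynamic, operational" flavor of alignment described in the message.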
[10:39] DavidPrice: @Amanda For us the main thing is to use a semantic language like SPARQL to define the transforms between data sources and targets and so everything must be presented as at least RDF, and preferably as an OWL ontology.
[10:40] DavidPrice: We also have ontologies of SPARQL, etc. which we often call 'system ontologies' as in software system ... just to confuse things even more:-)
[10:41] MatthewWest: Sorry I have to go.
[10:41] AmandaVizedom: @David - though I did and do think there needs to be some explicit (meta)data on/in the proxy ontology to make clear that it is not a full ontology - that is, under the formal semantics of the language used, it would not likely be computationally sound.
[10:41] BobbinTeegarden: @Ilya what did you use for the visualizations, and how well received were they?
[10:42] MichaelGruninger: my copy has page numbers
[10:42] GiulianoLancioni: mine too
[10:42] JimRhyne: So does mine.
[10:42] AmandaVizedom: Copy just downloaded from refreshed call page does have the page numbers.
[10:43] MikeBennett: @Amanda et al - we need a metaontology. Or at least an ISO 1087 compliant terminology and vocabulary setting out rather less messy uses of words like "Ontology" - people think they are listening about the same thing when someone is talking about a different thing - dog food much?
[10:43] DougFoxvog: The link to the slides is the same. Re-download & you'll get the page numbers
[10:43] SteveRay: Strange, I just did this and do not get slide numbers...
[10:44] DavidPrice: We have a taxonomy of ontology-related artifacts we use (and modify as required in various projects). It can be used as metadata but we also use it in the base of the URIs for things and even in the name of graphs so it's visible to the ontologies/software developer.
[10:44] Ilya Zaslavsky: @BobbinTeegarden: Inxight startree, semantic wiki, also recently Silverlight.
Hydrologists were comfortable using Inxight startree in the tagging application
[10:45] DougFoxvog: SLIDE 5 now
[10:46] DavidPrice: We actually often use the phrase 'schema' when we mean the 'ontology' in many projects because customers are more familiar with that terminology. We then have graphs that are 'transforms', that are 'swp', that are 'spin', that are 'testcase', etc. as you would in a typical software development team.
[10:47] AmandaVizedom: @Mike - Yup. And we've probably gotten far enough along since we started saying that that we could actually build one, identifying main types of artifact. We would then run into the problem of every domain, in that we would disagree over what to call the various types. Were we then to get over the names problem (via multiple, contextual labels and/or other techniques), we'd then have a very useful product *and* a useful methodological example! ;-)
[10:49] Ilya Zaslavsky: @ElisaKendall: no, we haven't. This sounds interesting. The current plan is to use SKOS
[10:52] Ilya Zaslavsky: @SimonSpero. No, in many cases these are not trees. We try to present them as trees where possible
[10:54] ElisaKendall: @Ilya: I have a draft ontology for ISO 1087 that we're planning to standardize at OMG, together with ISO TC 37. I'm guessing that we will be publishing a draft sometime this spring/summer, with one of the goals being to use it to assist in mapping SBVR vocabularies to ODM/OWL. If you'd like to chat more about this offline, please feel free to contact me directly, at ekendall at thematix.com.
[10:57] DavidPrice: @Amanda We have published some work on a metadata for ontology at http://linkedmodel.org/doc/vaem/1.2/ Vocabulary for Essential Metadata ... Ralph Hodgson has pushed that effort.
[10:59] Harold Boley: @Elisa, do you use SBVR rules for mapping?
[11:01] Ilya Zaslavsky: @ElisaKendall, thanks Elisa, I'd be interested
[11:01] anonymous morphed into NancyWiegand
[11:03] DougFoxvog: @Line: attaching multiple textual terms to individual ontology terms would assist in search, not relying on searching only on the ontology term's name.
[11:05] ElisaKendall: @Harold, not so far, but Mark Linehan (IBM) has been doing some work in this area for a Date Time vocabulary we're creating at OMG. The current alpha (or maybe beta) spec is available at http://www.omg.org/spec/DTV/1.0/Beta1/, but doesn't include much of the OWL work we've been doing more recently in finalization.
[11:07] ElisaKendall: @Harold, you might take a look even at the Beta spec for the Date Time effort, as it does include some OCL and CLIF statements for the SBVR definitions, which we're still refining, but could give you a better sense of what the SBVR actually is intended to say :).
[11:12] Harold Boley: @Elisa, thanks, I just opened the huge http://www.omg.org/spec/DTV/1.0/Beta1/ PDF.
[11:14] FrankOlken1: @ElisaKendall The URL you posted for DTV yields a 404 (page missing) error ...
[11:15] ElisaKendall: @Harold, Sorry :), but hopefully you'll find it useful. When we've added the OWL it will only get bigger ... as you might imagine, but I think the result could provide a next generation OWL Time ontology, and includes a number of business oriented definitions.
[11:15] MikeBennett: Is it the mental model of hydrologists that doesn't map to Protege, or the fact that to use Protege you first need a mental model of ontology, since Protege in no way presents a visual or other model of ontology to the person looking at it?
[11:16] FrankOlken1: @ElisaKendall The DTV link that Harold Boley posted seems to work.
[11:16] SimonSpero: Testing; Competency or performance? :-)
[11:16] ElisaKendall: @Frank, hmmm... for me it downloads, so I'm glad you were able to get to it.
[11:17] LinePouchard: DataONE is offering an Internship program for Summer 2012 at http://www.dataone.org/internships. I am co-mentoring for Project #7 to continue the work described today. In particular, this work needs to extend parts of the SWEET ontologies w.r.t soil science, and a candidate with domain knowledge would be ideal. The intern works remotely from their own institution for ten weeks. The deadline is March 12. Please share with your students.
[11:17] Harold Boley: @Frank, my quote of the URL omitted the comma, wrongly assumed to be part of the URL by the chat software.
[11:17] MikeBennett: @Frank et al - there is a rogue comma in the original URL.
[11:18] ElisaKendall: @Frank, the units/dimensions part of the model is limited, fyi, but the SysML effort for quantities and units and this units model are in the process of being aligned now at OMG, with the SysML version being more comprehensive, as you might imagine/hope.
[11:19] MikeBennett: Expressivity versus what is expressed - these are two distinct matters.
[11:21] LarryLefkowitz: And formalism vs content is another distinction. Having a great grammar but a small vocabulary is certainly going to limit expressivity. Yes, in theory you can create new vocabulary on the fly, but that could easily overtake the initial modeling task.
[11:22] DavidFlater: @Steve What happened with OASIS Quantities & Units of Measure Ontology?
[11:22] MikeBennett: Exactly. You can have a very expressive model of data, but it's still a model of data. Or you can have a more or less expressive model of real things in the problem domain, and it's an ontology.
[11:23] Ilya Zaslavsky: @LinePouchard: do you know if there are already soil vocabularies available in some form? Is there any relationship with SoilML? CZO project would be interested in this.
[11:25] PeterYim: @ALL - the memory and the work of Dr. Robert Raskin (http://ontolog.cim3.net/cgi-bin/wiki.pl?RobRaskin ) who passed away last Friday, will be with us.
Rob was the PI of the SWEET Ontology (Semantic Web for Earth and Environmental Terminology) project, and an active contributor to this community
[11:25] LinePouchard: @Ilya: yes, I had talked to Nancy W. about this, and last Fall, SoilML was not released yet.
[11:25] Harold Boley: @Elisa, yes it looks very exhaustive. Can I mention it in the OASIS TC on LegalRuleML?
[11:26] DamianGessler: Thank you
[11:26] GaryBergCross: bye all
[11:26] DWiz3: Steve or whoever: I don't know how I am now 4 instances. Pls eliminate 3 of me.
[11:26] PeterYim: great session!
[11:26] PeterYim: -- session ended: 11:26am PST --
[11:26] SteveRay: Thanks everybody for making the session stimulating.
[11:33] ElisaKendall: @Harold,
[11:33] DavidPrice: There is also the QUDT ontology from NASA Ames that was input to the OASIS QUOMOS activity ... http://qudt.org/
[11:33] ElisaKendall: Yes, of course, and please give people the link to the specification. Our next iteration will have more OWL, and ultimately we will have a set of OWL ontologies corresponding to the SBVR
[11:34] ElisaKendall: @David, Folks from OMG who are working on the SysML QUDV effort are looking at alignment with that as well, fyi, but I don't know much about the differences. I have heard there are some, which they are working through, and that JPL is involved along with ESA.
-------------