WO2007139999A2 - Collaborative curation of knowledge from text and abstracts - Google Patents
- Publication number
- WO2007139999A2 (PCT/US2007/012624)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- knowledge
- source
- curation
- information
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
- G06F16/345—Summarisation for human users
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
Definitions
- the present invention relates to collaborative reference curation.
- In addition to the data that exists in various public and private databases, there is a much larger and ever-increasing amount of information buried in existing biomedical articles. It is beyond human ability to read all of the relevant articles and recall their relevant findings for further research.
- the sheer volume of the articles and their constant growth make it prohibitively expensive to employ (and monetarily compensate) human curators to read through the articles and cull the useful information buried in them.
- the volume of existing biomedical articles is huge and it grows day by day. From 1994 to 2004, close to 3 million biomedical articles were published by US and European researchers alone. Added to the approximately 15 million abstracts already in PubMed, this represents over 800 new articles per day and a myriad of individual new facts to survey for information relevant to a particular research question.
- HPRD: Human Protein Reference Database, available at <http://www.hprd.org/>
- a Web-based system operates to facilitate collaborative curation of biomedical knowledge from biomedical text and abstracts. More generally, the system may take other, non-web-based forms, and may also be used for applications other than those directly relating to biomedical knowledge or research.
- a Web-based software system is configured on a computer that is connected to a public network, such as the Internet.
- When a database or repository site such as PubMed is accessed through the computer, software will open a side-by-side frame interface.
- When a particular abstract or article is explored, the system will connect to a server and attempt to locate existing curated knowledge regarding that article or abstract. Any retrieved knowledge may be displayed in the new frame, and the user may be permitted to vote on the correctness of each displayed knowledge element. Further, the user may be permitted to add new knowledge elements and/or revise existing knowledge elements.
- a registration process may be required and users may be assigned categories and their permission levels modified according to user categories. For example, a professor may be allowed to create new knowledge schemas, while undergraduate students may only be allowed to browse the knowledge. To motivate researchers to participate, the system may use automatic text extraction programs to extract knowledge for all articles as a bootstrap. Thus an initial visit to an article may not result in a blank knowledge-base. Rather the user may see automatically extracted knowledge and will be able to vote on its relevance and/or value. The voting is preferred, as the automatic extraction systems are not necessarily accurate, and even the best automatic extraction systems do not have perfect recall or precision.
- the system may be an open system accessible for free to the research community or the public. However, other fee based and controlled systems are also contemplated.
- Figure 1 is a block diagram illustrating the functional architecture of a collaborative curation system.
- Figure 2 is a block diagram illustrating the functional architecture of a collaborative curation system.
- Figure 3 is a block diagram illustrating another architecture of a collaborative curation system.
- Figure 4 is a flow diagram illustrating operations performed in initiating use of a collaborative curation system.
- Figure 5 is a flow diagram illustrating operations performed in using a collaborative curation system as a research tool.
- Figure 6 is a wireframe illustration of a user interface displayed at a client computer, including a reference interface and a Web band.
- Figure 7 is a wireframe illustration of the Web band portion of the user interface, showing the tabular display of curation information.
- Figure 8a is a wireframe illustration of a user interface for rating confidence levels in an element of curation information.
- Figure 8b is a wireframe illustration of a display in which users are provided with a history of confidence level ratings.
- Figure 9 is a flow diagram illustrating steps taken in a collaborative curation method.
- Figure 10 is a block diagram illustrating the functional architecture of a collaborative curation system.
- Figure 11 is a schematic wireframe illustration of a user interface displayed at a client computer, including a reference interface and a Web band.
- FIG. 1 is a block diagram illustrating an architecture of a collaborative curation system in accordance with a preferred embodiment.
- a packet network 102 such as the Internet, interconnects a first source 106, a second source 108, and a client computer, such as a personal computer 104.
- the personal computer 104 is preferably equipped with an Internet browser or other software that can be used as a research tool for seeking out references, such as publications stored electronically in an on-line database.
- the client computer is equipped with watching software that operates to determine if and/or when the client computer is being used to seek out references.
- this watching software is in the form of a browser extension; in this case, the watching software is referred to as a browser helper.
- the watching software may be an element of software external to the browser, or it may be a part of a stand-alone software research system.
- the browser helper watches network activity at the client computer 104 to determine when the browser is being used to seek out references over the network.
- the browser software provides the user interface for control of the watching software.
- the first source 106 contains reference information such as publication text, keywords, titles, citation information, and/or abstracts, and the first source makes at least some of this information available over a network, although such availability may require a subscription or be otherwise subject to limited access rights. These references may be, but are not necessarily, research articles, such as biomedical research articles.
- the first source can be referred to as a reference server.
- The term "server" as used herein is not intended to be limited to implementations on a single computer, but rather encompasses other implementations, such as those on multiple computers, which may be distributed at different locations.
- the second source 108 contains knowledge elements associated with references in the first source 106, and the second source makes at least some of this information available over a network.
- the knowledge elements may include, for each research article, information identifying the various biological entities, diseases, biological processes — and interactions therebetween — discussed in the article.
- Biological entities may include proteins, genes, and organs, among others.
- Because the second source includes information that can be used to organize information in the first source, it can be referred to as a curation server, which again may be implemented on one or more computers, which may or may not be separate from the reference server.
- Although the reference server 106 and the curation server 108 are shown as separate elements in Figure 1, it is contemplated that the two sources may be configured on the same physical hardware and even within the same database.
- the reference server 106 may be the PubMed system, and the curation server 108 may be fully incorporated into the PubMed database system. Further, one or more of the sources may temporarily or permanently be included on the client computer 104 so that access across the packet network 102 is not necessary.
- Figure 2 provides another look at the architecture of a collaborative curation system at the database level.
- a client computer 204 (which may be a personal computer or other computing device, including a PDA or handheld device) may be connected to various reference databases 206 as well as a curation database 202. When the client computer 204 retrieves references (such as texts, titles, citations, abstracts, etc.) from the reference database, the curation database is used to locate supporting information regarding the retrieved reference.
- a more detailed architecture of one possible curation system is illustrated in Figure 3.
- References of interest to researchers are stored in one or more databases 302.
- a download agent 304 is capable of automatically retrieving references from the reference databases 302 and storing the reference text in a text database 306.
- An extractor system consults this stored text and operates software to analyze the text and to extract curation information from the text.
- Other databases 310 may likewise include curation information relating to the references stored in the reference databases 302. The system can benefit from this information by implementing another download agent 312. This agent populates another database 314 with the downloaded curation information. Because the downloaded information may not be in the same format as that output by the extractor systems 308, a data format exchange system 316 operates software to translate data formats if necessary.
- a user interface 320 enables a user 322 to browse through the facts that comprise the curation information, to vote on the accuracy of those facts, to add new facts or modify facts, to invoke the extraction system, or to handle user account information.
- An exemplary curation system makes use of the interactions between an end user, that user's client computer running browser software with a browser helper, a curation server, and a reference server. Such interactions are described herein with respect to one particular embodiment.
- the curation server is a server affiliated with the CBioC project, described at <www.cbioc.org>
- the reference server is a server affiliated with the PubMed service, available at <www.pubmed.gov>. It should be understood that these examples are chosen for the sake of clarity, but that the invention is not limited to these particular services or the servers and databases affiliated with them.
- Figure 4 illustrates how a user's attempt to access the PubMed database automatically triggers the client computer to begin setting up a session with the CBioC server.
- the user attempts to navigate to the PubMed Web site by, for example, selecting an appropriate bookmark or hyperlink, or typing a uniform resource locator (URL) addressing the PubMed Web site.
- the browser software attempts to retrieve the PubMed welcome page by sending a request to the PubMed server.
- the request may be, for example, a GET request in the hypertext transfer protocol (HTTP).
- the browser helper in step 6, detects that the browser has requested a page from the PubMed Web site. It may do this by, for example, monitoring HTTP requests generated by the browser software and determining whether such a request indicates that the user is attempting to navigate to the PubMed Web site. In other embodiments, software at the client computer may use other techniques to detect a user's attempt to access the PubMed Web site.
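The request-monitoring check described above might look roughly like the following. This is a minimal sketch, not the patented implementation: the function name and the `WATCHED_HOSTS` set are assumptions made for illustration, and a real browser helper would receive requests through whatever hook the browser exposes.

```python
from urllib.parse import urlsplit

# Assumed, illustrative list of reference-site hosts the helper watches.
WATCHED_HOSTS = {"www.pubmed.gov", "pubmed.gov"}


def is_reference_site_request(method: str, url: str) -> bool:
    """Return True when an outgoing GET request targets a watched host."""
    if method.upper() != "GET":
        return False
    # urlsplit lowercases the hostname; it is None for URLs without a host.
    host = urlsplit(url).hostname or ""
    return host in WATCHED_HOSTS
```

Requests to any other host fall through unchanged, which matches the idea that the helper does not interfere with ordinary browsing.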
- the browser helper in step 8 requests the browser to open a curation region of the browser's user interface.
- This curation region may be, for example, a new window, a new tab, or a new frame.
- this curation region is in the form of a "Web band," which appears as a separate region below a main window of the browser's client area.
- the browser in step 10 opens the Web band.
- the request for the PubMed welcome page that was generated by the browser in step 4 is received by the PubMed server, and the PubMed server, in response, sends the PubMed welcome page to the client computer in step 12.
- the computer receives the welcome page and displays it in step 14 in the main window of the browser's client area.
- the main window of the client area operates as it would if the browser helper were not present.
- the user can make use of the browser's "address bar" or navigation buttons, and otherwise navigate through the Web (including to Web sites unrelated to research) without interference from the browser helper.
- some (but not all) interactions taking place through the main window cause the browser helper to take action, and, where appropriate, to display relevant information in the Web band.
- the browser helper in step 16 requests a welcome page from the CBioC server.
- the CBioC server receives the request and sends the CBioC welcome page to the client computer, where the browser software causes it to be displayed in the Web band.
- the welcome page prompts the user to enter his or her username (which may be an email address) and password.
- the user obliges in step 22, and the client computer sends the username and password to the CBioC server (step 24) for authentication (step 26). Assuming authentication is successful, the CBioC server initiates a session with the user in step 28.
- the establishment of a session may involve setting up a session identifier stored as a "cookie" at the client computer.
- the client computer can send the session identifier to the CBioC server, so that the user need not re-enter a username and password with each transaction.
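The session mechanism described above can be sketched as a small in-memory store. This is an illustration under stated assumptions: the class name `SessionStore` and its methods are hypothetical, and a deployed server would also need expiry and persistent storage, which are omitted here.

```python
import secrets


class SessionStore:
    """Maps opaque session identifiers to authenticated usernames."""

    def __init__(self) -> None:
        self._sessions = {}

    def start(self, username: str) -> str:
        """Create a session after successful authentication.

        The returned identifier is what the server would hand to the
        client to store as a cookie.
        """
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = username
        return session_id

    def user_for(self, session_id: str):
        """Return the username for a session identifier, or None if unknown."""
        return self._sessions.get(session_id)
```

On later transactions the client sends only the identifier, and the server resolves it with `user_for` instead of asking for the password again.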
- the level of authentication required may be minimal or nonexistent.
- authentication may permit different levels of access for different users. For example, no authentication may be required for read-only access, whereas some authentication may be required for read/write access.
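One way to picture the tiered access just described is an ordered set of levels with a minimum level per action. The sketch below is an assumption-laden illustration: the action names and the mapping in `REQUIRED_LEVEL` are invented for the example and are not taken from the specification.

```python
from enum import IntEnum


class Access(IntEnum):
    """Access levels, ordered so that a higher value implies more rights."""
    READ_ONLY = 1
    READ_WRITE = 2


# Assumed mapping of actions to the minimum level they need (illustrative only).
REQUIRED_LEVEL = {
    "browse": Access.READ_ONLY,
    "vote": Access.READ_WRITE,
    "add": Access.READ_WRITE,
    "modify": Access.READ_WRITE,
}


def access_level(authenticated: bool) -> Access:
    """Unauthenticated users may read; authenticated users may also write."""
    return Access.READ_WRITE if authenticated else Access.READ_ONLY


def is_allowed(action: str, level: Access) -> bool:
    """Deny unknown actions; otherwise compare against the required level."""
    required = REQUIRED_LEVEL.get(action)
    return required is not None and level >= required
```

Further user categories could be added as additional enum members without changing `is_allowed`.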
- the CBioC server sends to the client computer a page indicating that the login has been successful.
- this page includes statistical information on the use of the CBioC service (step 30).
- the client computer displays this statistical information in the Web band.
- the PubMed server receives these parameters in step 40 and, in response, runs a search of the PubMed database (step 42).
- the PubMed database sends the results of the search to the client computer, where they are displayed in step 46 by the browser software.
- the search results are in the form of article citations, and each of these citations is hyperlinked to retrieve the abstract of the article. If the user clicks on such a hyperlinked citation, as in step 48, the browser sends a request for the selected reference.
- the PubMed server receives and processes the request and sends the selected reference to the client computer (step 52), where it is displayed, as expected by the user, in the main window (step 54).
- the browser helper also determines that the user has attempted to view a particular reference, and this determination calls the browser helper into action.
- the browser helper determines that the user is attempting to view a particular reference by detecting in step 56 the reference request that was generated in step 50.
- the browser helper parses HTTP GET requests generated by the browser to determine whether a particular reference has been requested, and, if so, the identity of that reference.
- the browser helper, watching software, or other client-side software may use other techniques to determine automatically when the user is attempting to view a reference. For example, instead of parsing outgoing request messages, the software could watch for incoming data that is indicative of a reference.
- the client-side software can provide a front-end user interface through which the user searches for and/or selects references; in this case, the client-side software reacts to the user's selections by generating the appropriate requests to the reference server and to the curation server.
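For the request-parsing approach described above, a helper could pull a reference identifier out of the requested URL. The sketch below is hedged: the query parameter names (`list_uids`, `uid`) and the trailing-number path layout are assumptions chosen for illustration, and the URL forms a given reference server actually uses would need to be checked.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Assumed query parameter names that might carry a reference identifier.
ID_PARAMS = ("list_uids", "uid")


def extract_reference_id(url: str):
    """Return the numeric reference identifier in a URL, or None if absent."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    for name in ID_PARAMS:
        for value in query.get(name, []):
            if value.isdigit():
                return value
    # Fall back to a path that ends in a number, e.g. /pubmed/12345/.
    match = re.search(r"/(\d+)/?$", parts.path)
    return match.group(1) if match else None
```

A `None` result means the request is not for an individual reference, so the helper would take no action.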
- Having detected in step 56 the request for a reference, the browser helper generates a request for curation information and sends that request to the CBioC server in step 60. If curation information for that reference is already available (for example, it is stored in a curation database), it is retrieved in step 62 and sent to the client computer in step 64.
- If, on the other hand, no curation information is already available to the CBioC server, that server operates to extract useful information automatically.
- the CBioC server requests its own copy of the reference being accessed by the user.
- the PubMed server sends that reference to the CBioC server (step 68), and the CBioC server automatically extracts curation information relating to the reference.
- the automatically extracted curation information may include, for example, the names of proteins, organs, diseases, genes, biological processes — and interactions between them — that are mentioned in the article.
- This automatically-extracted curation information is then stored in a database by the CBioC server (step 72) and is used to respond to the current request (from step 58) and to respond to later requests relating to the same reference.
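The look-up-then-extract behaviour described in the preceding steps can be summarized in a short sketch. The class and parameter names are hypothetical, the "database" is a plain dictionary, and the fetch and extract functions are supplied by the caller because the specification does not fix how either is implemented.

```python
from typing import Callable, Dict, List

Fact = Dict[str, str]


class CurationService:
    """Serves stored curation facts, extracting them on first request."""

    def __init__(
        self,
        fetch_reference: Callable[[str], str],
        extract: Callable[[str], List[Fact]],
    ) -> None:
        self._db: Dict[str, List[Fact]] = {}  # stand-in for a curation database
        self._fetch_reference = fetch_reference
        self._extract = extract

    def curation_for(self, reference_id: str) -> List[Fact]:
        """Return curation facts, running the extractor only when none are stored."""
        if reference_id not in self._db:
            text = self._fetch_reference(reference_id)
            self._db[reference_id] = self._extract(text)
        return self._db[reference_id]
```

Because extracted facts are stored, a second request for the same reference is answered from the store without contacting the reference server again.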
- Once the curation information has been sent to the client computer, it is displayed in the Web band of the browser client area (step 74).
- users of the curation system not only benefit from receiving curation information associated with a reference, they are also able to contribute to the curation information, for example by adding new information and by judging the accuracy of the information already present.
- FIG. 6 is a schematic illustration showing the layout of a user interaction screen displayed by the browser and browser helper software.
- a client area 600 is divided into a reference interface 602, and a Web band 604.
- the PubMed welcome page is displayed in the main window.
- the PubMed welcome page includes a text box 606 in which a user types search parameters, such as key words, together with command buttons such as the "Go" button (608) to submit the search parameters.
- the Web band displays text boxes prompting the user to enter his or her email address (box 610) and password (box 612) and a command button 614 that causes the address and password to be sent to the CBioC server, in order to log in to the server.
- Figure 7 illustrates curation information that is displayed in the Web band 604 when the user navigates to an abstract on the PubMed server.
- Figure 7 may illustrate the outcome of step 74 in Figure 5, when curation information is received by the client computer and displayed in the Web band.
- the curation information relating to the requested reference is illustrated in the format of a tabbed table. Through the selection of various tabs, the user can view curation information relating to Protein/Protein interactions (tab 702), Gene/Disease interactions (tab 704), Gene/Organ interactions (tab 706), and Gene/Bio Process interactions (tab 708).
- the curation information under each tab, relating to each class of interaction, is organized in a fashion analogous to the Protein/Protein interactions illustrated under tab 702. Accordingly, for the sake of simplicity, only the contents of tab 702 are illustrated herein.
- curation information is provided relating to interactions between proteins discussed in the reference.
- the curation information is organized in a table in which each row describes an interaction.
- the column “Protein 1” names a protein that has some role in the interaction.
- the column “Interaction” names the type of interaction or relationship between proteins (e.g., “regulator,” “binds,” “inhibits,” “interacts,” “stimulates,” “depletion,” “phenotypes,” “repressed,” “interact,” “expressed,” among others).
- the column “Protein 2” names a second protein involved in the interaction. Although it is not necessary, it is desirable for these first three columns to read as a declarative sentence relating to a protein/protein interaction.
- the first three columns together read "KAP1 stimulates p53 HDAC1 complex," which conveys information in a form easy for a researcher to identify and understand.
- Where the interaction between proteins discussed in a reference is less clear, it may not be possible or desirable to present information in such a straightforward way.
- the nature of the interaction may be described more broadly, as in the fourth row, which reads "MDM2 interacts KAP1." With this information, researchers looking into the relationship between MDM2 and KAP1 can at least identify the reference as one that refers to such an interaction, even where no more precise formulation would be appropriate.
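The three-column row that reads as a declarative sentence maps naturally onto a small record type. The following is only a sketch: the class and field names are assumptions that mirror the column headings, not a data model taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One table row: two proteins and the relationship between them."""
    protein1: str
    interaction: str
    protein2: str

    def sentence(self) -> str:
        """Join the three columns so the row reads as a short sentence."""
        return f"{self.protein1} {self.interaction} {self.protein2}"
```

For example, `Interaction("MDM2", "interacts", "KAP1").sentence()` yields the broadly worded fourth-row statement quoted above.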
- the column "Source” identifies the person or other source responsible for entering the interaction in the curation system.
- the "Source" may be the screen name or other identifier associated with the human user (e.g., "reader1").
- the name of the software or other automatic agent may be identified as the "Source." For example, interactions may be entered automatically by the IntEx system, described in S. T.
- a user with read-write access to the curation information can not only review information that has been entered by others (or automatically); he or she also has the capability of adding new curation information. For example, a user may review a reference and note that a particular interaction is not listed with the curation information. That user has the capability to manually add information on that interaction to the corpus of curation information.
- text boxes are provided for a user to enter information relating to a first protein (box 710), a second protein (box 712), and the nature of the interaction (box 714).
- By clicking an "Add" button (command button 716), the user causes the data he or she has entered to be transmitted to the curation server (e.g., the CBioC server) for storage and later retrieval.
- Such a user is further provided with the ability to modify curation information already stored by the curation server.
- the user can initiate the making of these modifications by clicking the "Modify" command button (e.g., button 718) in the row of interest.
- the history of modifications to curation information is stored, and changes can be accessed by users for review.
- Much of the curation information may have been generated automatically, or it may have been generated by individuals whose qualifications and biases are not known to a particular user. To encourage an appropriate level of confidence in the curation system, the system permits users to rate the accuracy of the curation information relating to each interaction. This rating information is displayed (preferably in summary form) along with other interaction information, providing for each user a sense of whether this information has been accepted by the research community at large.
- At least two types of rating information can be provided.
- a user can provide feedback on whether the interaction has been accurately characterized. An interaction may be inaccurately characterized if, for example, an automatic interaction extractor erroneously identifies an "interaction" between two non-interacting proteins discussed in the reference. Similarly, a human user may make a typographical or other error while typing interaction information.
- a second kind of feedback a user can provide relates to how well-documented a particular interaction is in the associated reference. For example, a reference may refer in passing to an observation that MDM2 is a regulator of p53, and that interaction may be properly extracted in the curation information, but it would be helpful for researchers to know that the reference provides little discussion of or support for the interaction.
- Buttons "Yes" (720) and "No" (722) are provided for each interaction. A user may select one of the buttons to vote on whether he or she believes the interaction has been correctly extracted, and each user's vote is sent to and stored by the curation server.
- each vote is associated with a user identifier.
- This association has at least two benefits. First, it can be used to prevent a user from (possibly inadvertently) voting more than once on the same interaction. Second, the association can be used to permit a user to change his or her vote, possibly as a result of gaining greater understanding of the subject matter of the interaction.
- the outcome of this first kind of feedback is displayed in columns entitled “% Approval” and “Votes.”
- the proportion of "yes” votes may be illustrated by a bar graphic (e.g., graphic 724), among other means.
- the "Votes” column indicates the number of votes entered regarding each interaction.
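Keying votes by user identifier gives both properties described above: a repeated vote replaces the earlier one instead of being counted twice, and a user can change his or her mind. The sketch below is illustrative, with hypothetical names, and computes the two displayed figures from the stored votes.

```python
from typing import Dict, Optional


class VoteTally:
    """Yes/No votes for one interaction, at most one per user."""

    def __init__(self) -> None:
        self._votes: Dict[str, bool] = {}  # user identifier -> approves?

    def cast(self, user_id: str, approves: bool) -> None:
        """Record a vote; a later vote by the same user overwrites the earlier one."""
        self._votes[user_id] = approves

    @property
    def votes(self) -> int:
        """Number of distinct users who have voted."""
        return len(self._votes)

    @property
    def approval_percent(self) -> Optional[float]:
        """Share of 'yes' votes as a percentage, or None before any vote."""
        if not self._votes:
            return None
        return 100.0 * sum(self._votes.values()) / len(self._votes)
```

Returning `None` for an interaction with no votes lets the interface distinguish "not yet voted on" from "0% approval".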
- a column entitled “Evidence Level” is provided.
- Through the "Rate" command button (e.g., button 726), the user is provided with the ability to enter feedback on the level of confidence in a particular interaction.
- this feedback is provided through a feedback interface such as a pop-up box 800, illustrated in Figure 8a, which appears over the curation region of the browser's user interface.
- a user may click on icons indicating zero through five stars to enter his or her opinion on the level of support for the selected interaction.
- clicking on star 804 indicates a selection of "two stars," while clicking on the crossed-out star 808 indicates “no stars.”
- a textual description 806 of each possible star rating appears when a user's cursor is held over one of the star icons (e.g., when the user "mouses over" an icon).
- Such a textual description can provide short normative guidelines as to the meaning of the star ratings.
- descriptions may be provided as follows:
- a user may optionally type a brief reason or clarification for his or her rating.
- a user's rating is sent to the curation server for storage. From the curation region of the user interface, as illustrated in Figure 7, a user may review the feedback relating to each interaction. In the column entitled "Average Confidence,” the curation information indicates whether an interaction has been rated at all and, if so, what the average of those ratings is. Where an average is provided, e.g. the rating "3" indicated at entry 728, the user may access details on that rating by clicking on the rating itself. In response, the client computer obtains from the curation server and displays information available on that rating.
- This information may be displayed in tabular format, as illustrated in Figure 8b.
- table 810 provides details on the confidence level.
- Each row of the table provides information on a rating provided by a user.
- the table identifies the user (by screen name, for example), the rating level, the time at which the rating was entered, and any comments entered by the user.
- the tabular format illustrated in Figure 8b may correspond to the structure of a relational database in which this information is stored by the curation server.
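A relational layout for the rating details could look like the following sketch, which uses an in-memory SQLite database. The table and column names are assumptions, as is the choice to keep one rating per user per interaction; the zero-to-five range follows the star ratings described earlier.

```python
import sqlite3


def open_ratings_db() -> sqlite3.Connection:
    """Create an in-memory database holding one rating per user per interaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE rating (
            interaction_id INTEGER NOT NULL,
            user_name      TEXT    NOT NULL,
            stars          INTEGER NOT NULL CHECK (stars BETWEEN 0 AND 5),
            rated_at       TEXT    NOT NULL,
            comment        TEXT,
            PRIMARY KEY (interaction_id, user_name)
        )
        """
    )
    return conn


def rate(conn, interaction_id, user_name, stars, rated_at, comment=None):
    """Insert a rating, replacing any earlier rating by the same user."""
    conn.execute(
        "INSERT OR REPLACE INTO rating VALUES (?, ?, ?, ?, ?)",
        (interaction_id, user_name, stars, rated_at, comment),
    )


def average_confidence(conn, interaction_id):
    """Return the mean star rating, or None if the interaction is unrated."""
    row = conn.execute(
        "SELECT AVG(stars) FROM rating WHERE interaction_id = ?",
        (interaction_id,),
    ).fetchone()
    return row[0]
```

Each row carries the user, rating level, time, and optional comment, so the detail table can be produced with a plain `SELECT` on `interaction_id`.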
- Other curation information illustrated in Figure 7 may be hyperlinked to enable users to learn more about an area of interest. Hyperlinked information is illustrated in Figure 7 with underlining.
- protein names may be hyperlinked, and a user's selection of a protein name can cause other curation information relating to that protein to be retrieved from the curation server.
- Each interaction may be accompanied by a "Related Articles" hyperlink, the selection of which causes the curation server to identify other articles relating to the interaction.
- the CBioC system described herein runs as a Web browser extension and allows unobtrusive use of the system during the regular course of research in PubMed. However, this system can also be accessed directly, without having to install a browser plug-in.
- the CBioC system described herein allows users to search the curation database for all facts related to a particular protein, gene, disease, or interaction word by typing the relevant term in a Search box within the CBioC Web band.
- CBioC automatically expands search terms with known synonyms of the terms.
- the facts available for a set of abstracts can be displayed by typing a comma-separated list of their PubMed identification numbers in the search box.
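The two search behaviours described above (synonym expansion for a term, and a comma-separated list of PubMed identification numbers) can be told apart by inspecting the input. This is a sketch under assumptions: the function name, the return shape, and the caller-supplied synonym table are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def parse_search(text: str, synonyms: Dict[str, Set[str]]) -> Tuple[str, List[str]]:
    """Classify search-box input as a list of identifiers or an expanded term.

    Returns ("pmids", [...]) when every comma-separated item is numeric,
    otherwise ("terms", [...]) containing the term plus its known synonyms.
    The synonym table is keyed by lowercase term.
    """
    query = text.strip()
    if not query:
        return ("terms", [])
    items = [item.strip() for item in query.split(",") if item.strip()]
    if items and all(item.isdigit() for item in items):
        return ("pmids", items)
    expanded = {query} | synonyms.get(query.lower(), set())
    return ("terms", sorted(expanded))
```

With an illustrative table such as `{"p53": {"TP53"}}`, a search for `p53` would also match facts recorded under `TP53`.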
- A curation system was described in Section B, above, with reference to the particular example of the CBioC service used in conjunction with the PubMed database. Such a combination is not the only possible implementation of the systems described herein. More generally, a method implemented by the systems described herein is illustrated in Figure 9. This method may be implemented by software operating on a user's computer.
- a determination is made that a user has accessed a first reference.
- knowledge elements related to the first reference are provided to the user.
- the knowledge elements may be provided through a graphic user interface at a display screen, for instance.
- the user is enabled to provide feedback regarding the knowledge elements. According to a preferred embodiment, the feedback provides user input on the relevance of the knowledge elements to the reference.
- Figure 10 illustrates the functional architecture of a curation system that is not necessarily tied to the CBioC or PubMed services. Although Figure 10 is not limited to the CBioC or PubMed services, it should be noted that Figure 10 also provides further detail on the type of curation system that implements the methods described with reference to those services in Figures 4 through 7.
- client computers 504 are in communication with a network 506. Through the network 506, those client computers interact with a curation system 508, which may be implemented on one or more servers, whether at a single location or spread among different locations on the same or on different networks.
- the system provides a browser helper 520, which preferably is sent as software (for example, as a dynamically-linked library, or DLL file) for installation on those computers.
- the browser helper includes a specified text watcher logic 522 to monitor the clients' requests for reference data and communication logic 524 to communicate the output from the text watcher logic 522.
- Provided to the client computers, either together with or separately from the browser helper logic 520, is Web band logic 526, which includes the Web band object logic 528 and communication logic 532.
- the browser helper logic is installed on each client computer and, whenever the user navigates to a Web page from where he or she can access an article or an abstract, the Web band logic is invoked.
- the browser helper logic 520 may also cause the user's Web browser to display a toggle button, through which the user may activate or deactivate the Web band logic.
- the browser helper may invoke the Web band logic by causing the client computer to download the Web band logic 526 in the form of, for example, a Web page with JavaScript or ActiveX controls.
- At least one server operates a Web application 510 offering Web forms 512, server controls 514, an extractor system adapter 516, and a database connection adapter 518.
- Knowledge element databases 534 store knowledge elements relating to references in one or more reference databases (not shown), and extractor systems 536 operate to automatically extract knowledge elements from those databases.
- the browser helper logic 520 informs the Web band logic 526 of this fact.
- the Web band logic 526 triggers the Web application 510 to provide relevant knowledge elements, such as curation information. If such knowledge elements are available in a knowledge element database 534, the Web application 510 retrieves those elements and sends them to the client computer 504. If they are not available, the Web application invokes an extractor system 536 to automatically generate knowledge elements and to store those elements in the database 534.
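The retrieve-or-extract behavior of the Web application 510 described above can be sketched as a lookup that falls back to extraction and stores the result. This is a minimal sketch, not the patented implementation: the class name `CurationService`, the method `get_knowledge_elements`, and the in-memory dictionary standing in for the knowledge element database 534 are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


class CurationService:
    """Illustrative stand-in for the Web application 510."""

    def __init__(self, extractor: Callable[[str], List[str]]):
        # In-memory dict standing in for the knowledge element database 534.
        self._store: Dict[str, List[str]] = {}
        # Callable standing in for an extractor system 536.
        self._extractor = extractor

    def get_knowledge_elements(self, reference_id: str) -> List[str]:
        """Return stored elements, extracting and storing them on a miss."""
        if reference_id not in self._store:
            # No curated knowledge yet: generate it automatically and
            # store it so later requests are served from the database.
            self._store[reference_id] = self._extractor(reference_id)
        return self._store[reference_id]
```

With this shape, the extractor runs at most once per reference, and every later request for the same reference is answered from the stored elements.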
- Figure 11 illustrates a user interface 400 such as may be displayed in the client area of a Web browser on a client computer 504.
- the user interface may take any number of forms as the system is used for its various purposes such as browsing facts, voting for facts, ranking facts, adding and modifying facts, adding or modifying a schema, invoking the extractor system, user management, searching, and so on.
- a reference database interface 402 may contain the text of a reference, or other bibliographic information regarding the reference, such as the title, author, source, etc.
- the reference database interface 402 may additionally or alternatively contain search functionality, such as a search box or other query form that may be used to search one or more reference databases (as in box 606 of Figure 6).
- a knowledge element interface 418 is provided and displays information regarding various knowledge facts (or, perhaps more generally, knowledge elements). Three knowledge facts 408, 410, and 412 are shown in order of importance and given ranks 1, 2, and 3. Each knowledge fact may include one or more mechanisms for learning more about the particular fact. Here, that mechanism is a link 420. A user may select the link to browse through information regarding the fact.
- Another interface portion may provide facilities for modifying a fact 414, adding a new fact 416, or voting on whether a particular fact is useful and/or relevant. According to a further embodiment, a user may "drag and drop" the facts to modify a ranking of the utility and/or relevance of the facts.
- Other examples of feedback mechanisms include a button, a link, a slider, or a text box.
- This information provided by the user may be related to the source containing the knowledge facts and updated therein.
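The voting and ranking of knowledge facts described above might be modeled as follows. The source does not specify how votes are combined into a rank, so the net-vote score used here is an assumption, and the names `Fact`, `vote`, `score`, and `rank_facts` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Fact:
    """A knowledge fact with user feedback counts."""
    text: str
    up: int = 0
    down: int = 0

    def vote(self, useful: bool) -> None:
        # Record one user's judgment of the fact's usefulness or relevance.
        if useful:
            self.up += 1
        else:
            self.down += 1

    @property
    def score(self) -> int:
        # Assumed scoring rule: net votes. The patent text does not
        # state a particular formula.
        return self.up - self.down


def rank_facts(facts: List[Fact]) -> List[Tuple[int, Fact]]:
    """Return (rank, fact) pairs, best score first, with ranks from 1."""
    ordered = sorted(facts, key=lambda f: f.score, reverse=True)
    return list(enumerate(ordered, start=1))
```

Because Python's sort is stable, facts with equal scores keep their original relative order, which gives a predictable display when no votes have been cast yet.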
- the system watches the user's access of the web through a browser's windows. Whenever the researcher accesses a web page from where she can access an article or an abstract, the CBioC system is invoked and an interaction frame is created, as shown in Figure 5.
- a toggle button may be provided in the browser to toggle the interaction frame on and off.
- databases that can be used as alternatives to or in addition to the PubMed database include Nature and Science.
- the extractor system can operate its own extraction software, e.g. by using a download agent, and parsing/matching text with a textual database. Other known methods can also be used to automatically determine a plurality of knowledge elements to associate with various references.
- Other databases (e.g., BioPax, DIF, Reactome) may also be used to automatically populate the knowledge element database.
- a data format exchange server may be used to download relevant data from the databases and convert them to the proper format.
- a user may be encouraged to add new elements to the CBioC database and also to rank various elements that have been proposed by the system or by other users.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Biophysics (AREA)
- General Physics & Mathematics (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioethics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Document Processing Apparatus (AREA)
Abstract
The invention relates to a networked system that operates to facilitate collaborative curation, in particular of biomedical knowledge from biomedical text and abstracts. When a researcher uses a client computer to access a reference in a database, such as the PubMed database, the client computer automatically connects to a curation server, such as CBioC, and attempts to locate existing curated knowledge regarding that reference. If no curated knowledge exists, the curation server retrieves a copy of the reference for itself and automatically extracts curation information through textual analysis of the reference. The curation information is sent to the client computer and displayed by it. The user may be permitted to vote on the accuracy of the displayed information, to revise the information, and/or to add new information regarding the reference. This new or revised information is stored by the curation server for later access.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US80839106P | 2006-05-25 | 2006-05-25 | |
US60/808,391 | 2006-05-25 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007139999A2 (fr) | 2007-12-06 |
WO2007139999A3 (fr) | 2008-08-28 |
Family
ID=38779272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2007/012624 WO2007139999A2 (fr) | Collaborative curation of knowledge from text and abstracts | 2006-05-25 | 2007-05-25 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2007139999A2 (fr) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015123542A1 (fr) * | 2014-02-14 | 2015-08-20 | Medaware Systems, Inc. | Biomedical research database development and uses |
US20150261859A1 (en) * | 2014-03-11 | 2015-09-17 | International Business Machines Corporation | Answer Confidence Output Mechanism for Question and Answer Systems |
US9628551B2 (en) | 2014-06-18 | 2017-04-18 | International Business Machines Corporation | Enabling digital asset reuse through dynamically curated shared personal collections with eminence propagation |
US10176157B2 (en) | 2015-01-03 | 2019-01-08 | International Business Machines Corporation | Detect annotation error by segmenting unannotated document segments into smallest partition |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6778979B2 (en) * | 2001-08-13 | 2004-08-17 | Xerox Corporation | System for automatically generating queries |
- 2007-05-25: WO PCT/US2007/012624, patent WO2007139999A2 (fr), active, Application Filing
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015123542A1 (fr) * | 2014-02-14 | 2015-08-20 | Medaware Systems, Inc. | Biomedical research database development and uses |
EP3105687A4 (fr) * | 2014-02-14 | 2017-10-11 | Medaware Systems, Inc. | Biomedical research database development and uses |
US20150261859A1 (en) * | 2014-03-11 | 2015-09-17 | International Business Machines Corporation | Answer Confidence Output Mechanism for Question and Answer Systems |
US20160026378A1 (en) * | 2014-03-11 | 2016-01-28 | International Business Machines Corporation | Answer Confidence Output Mechanism for Question and Answer Systems |
US9628551B2 (en) | 2014-06-18 | 2017-04-18 | International Business Machines Corporation | Enabling digital asset reuse through dynamically curated shared personal collections with eminence propagation |
US10298676B2 (en) | 2014-06-18 | 2019-05-21 | International Business Machines Corporation | Cost-effective reuse of digital assets |
US10176157B2 (en) | 2015-01-03 | 2019-01-08 | International Business Machines Corporation | Detect annotation error by segmenting unannotated document segments into smallest partition |
US10235350B2 (en) | 2015-01-03 | 2019-03-19 | International Business Machines Corporation | Detect annotation error locations through unannotated document segment partitioning |
Also Published As
Publication number | Publication date |
---|---|
WO2007139999A3 (fr) | 2008-08-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Makri et al. | A library or just another information resource? A case study of users' mental models of traditional and digital libraries | |
US7216121B2 (en) | Search engine facility with automated knowledge retrieval, generation and maintenance | |
US8275666B2 (en) | User supplied and refined tags | |
KR101191531B1 (ko) | Search systems and methods using inline contextual queries | |
US8521675B2 (en) | Integrated automatic user support and assistance | |
JP6403340B2 (ja) | Systems, methods, and software for processing, presenting, and recommending citations | |
US8484184B2 (en) | Navigation assistance for search engines | |
US7421441B1 (en) | Systems and methods for presenting information based on publisher-selected labels | |
US20020169771A1 (en) | System & method for facilitating knowledge management | |
US20070088695A1 (en) | Method and apparatus for identifying documents relevant to a search query in a medical information resource | |
US7065536B2 (en) | Automated maintenance of an electronic database via a point system implementation | |
US20060271561A1 (en) | System and method for conducting tailored search | |
US20210391075A1 (en) | Medical Literature Recommender Based on Patient Health Information and User Feedback | |
WO2015017726A1 (fr) | Application for building search result maps | |
WO2007139999A2 (fr) | Collaborative curation of knowledge from text and abstracts | |
JP5897991B2 (ja) | Expert evaluation information management device | |
Laulederkind et al. | The rat genome database: genetic, genomic, and phenotypic data across multiple species | |
Xie et al. | User involvement and system support in applying search tactics | |
Laulederkind et al. | Exploring genetic, genomic, and phenotypic data at the rat genome database | |
Smith | Domain‐independent search expertise: Gaining knowledge in query formulation through guided practice | |
Jácome et al. | BIOMedical Search Engine Framework: lightweight and customized implementation of domain-specific biomedical search engines | |
Moulaison et al. | Beyond failure: Potentially mitigating failed author searches in the online library catalog through the use of linked data | |
US8024338B2 (en) | Systems, methods, and interfaces for reducing executions of overly broad user queries | |
US20250231952A1 (en) | System and method for ranking search engine results | |
Yilmaz et al. | Snippet Generation Using Local Alignment for Information Retrieval (LAIR) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 07795428; Country of ref document: EP; Kind code of ref document: A2 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 07795428; Country of ref document: EP; Kind code of ref document: A2 |