Journal article

Building queries for prior-art search

  • Mahdabi, Parvaz, Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Keikha, Mostafa, Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Gerani, Shima, Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Landoni, Monica, Faculty of Informatics, Università della Svizzera italiana, Switzerland
  • Crestani, Fabio, Faculty of Informatics, Università della Svizzera italiana, Switzerland
    2011
Published in:
  • Lecture Notes in Computer Science, Springer, 2011, vol. 6653, p. 3-15
Prior-art search is a critical step in the examination procedure of a patent application. This study explores automatic query generation from patent documents to facilitate the time-consuming and labor-intensive search for relevant patents. The task depends on identifying discriminative terms in the different fields of a query patent, since these terms are what distinguish relevant patents from non-relevant ones. To this end, we investigate the distribution of terms occurring in different fields of the query patent and compare these distributions with the rest of the collection using language modeling estimation techniques. We experiment with term weighting based on the Kullback-Leibler divergence between the query patent and the collection, and also with parsimonious language model estimation. Both techniques promote words that are common in the query patent and rare in the collection. We also incorporate the classification assigned to patent documents into our model, to exploit available human judgements in the form of a hierarchical classification. Experimental results show that retrieval using the generated queries is effective, particularly in terms of recall, and that the patent description is the most useful source for extracting query terms.
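
As an illustration of the Kullback-Leibler term weighting mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It scores each term of a query patent by its pointwise contribution to the divergence between the query patent's language model and the collection's, then keeps the top-scoring terms as the query. The function names (kl_term_scores, build_query), the additive smoothing parameter mu, and the toy token lists are all assumptions made for the example; the abstract does not specify the smoothing or the number of terms selected.

```python
import math
from collections import Counter


def kl_term_scores(query_tokens, collection_tokens, mu=1.0):
    """Score each query-patent term by P(w|Q) * log(P(w|Q) / P(w|C)).

    Assumption: the collection model uses additive (add-mu) smoothing so
    that terms unseen in the collection still get a nonzero probability.
    """
    q_counts = Counter(query_tokens)
    c_counts = Counter(collection_tokens)
    q_total = sum(q_counts.values())
    c_total = sum(c_counts.values())
    vocab_size = len(set(q_counts) | set(c_counts))

    scores = {}
    for term, count in q_counts.items():
        p_q = count / q_total  # maximum-likelihood estimate in the query patent
        p_c = (c_counts[term] + mu) / (c_total + mu * vocab_size)
        scores[term] = p_q * math.log(p_q / p_c)
    return scores


def build_query(query_tokens, collection_tokens, k=10):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = kl_term_scores(query_tokens, collection_tokens)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Hypothetical toy data: one short "patent" and a small "collection".
    patent = ["rotor", "blade", "rotor", "the", "the", "a"]
    collection = (["the"] * 50 + ["a"] * 30 + ["rotor"]
                  + ["blade"] * 2 + ["method"] * 17)
    print(build_query(patent, collection, k=2))
```

Terms that are frequent in the patent but rare in the collection receive positive scores, while common function words score below zero, which matches the behaviour the abstract attributes to both weighting techniques. Applying the same scoring separately to each field of the patent (for example, the description or the claims) would be one way to compare fields as sources of query terms.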
Language
  • English
Classification
Computer science and technology
Identifiers
Persistent URL
https://n2t.net/ark:/12658/srd1318549