Patent document clustering with deep embeddings
- 23 March 2020
- journal article
- research article
- Published by Springer Science and Business Media LLC in Scientometrics
- Vol. 123 (2), 563-577
- https://doi.org/10.1007/s11192-020-03396-7
Abstract
The analysis of scientific and technical documents is crucial in the process of establishing science and technology strategies. One popular method for such analysis is for field experts to manually classify each scientific or technical document into one of several predefined technical categories. However, not only is manual classification error-prone and expensive, but it also requires extended efforts to handle frequent data updates. In contrast, machine learning and text mining techniques enable cheaper and faster operations, and can alleviate the burden on human resources. In this paper, we propose a method for extracting embedded feature vectors by applying a neural embedding approach for text features in patent documents and automatically clustering the embedding features by utilizing a deep embedding clustering method.
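The abstract does not spell out how the clustering step operates, so the following is only a minimal sketch of the soft-assignment and target-distribution computations commonly associated with deep embedding clustering, not the authors' implementation. It assumes the patent documents have already been mapped to embedding vectors. The toy 2-D embeddings, the initial centroids, the function names, and the `alpha` parameter are all hypothetical, and the encoder network that would be trained against this objective is omitted.

```python
import math

def soft_assign(embeddings, centroids, alpha=1.0):
    """Soft assignment q[i][j] of embedding i to centroid j (Student's t kernel)."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            dist2 = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # each row sums to 1
    return q

def target_distribution(q):
    """Sharpened targets: p[i][j] proportional to q[i][j]^2 / f[j], f[j] = sum_i q[i][j]."""
    n_clusters = len(q[0])
    f = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(n_clusters)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all documents; the quantity a training loop would minimize."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0
    )

# Hypothetical toy data: four document embeddings and two initial centroids.
embeddings = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.1, 0.05], [2.95, 3.05]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)

# Hard cluster label for each document: the centroid with the largest soft assignment.
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
```

In a full pipeline the centroids would typically be initialized by k-means on the embeddings and then updated jointly with the encoder by gradient descent on `loss`; here they are fixed to keep the sketch short.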
Funding Information
- National Research Foundation of Korea (NRF-2015R1C1A1A01056185)
- National Research Foundation of Korea (2018R1D1A1B07045825)