Patent document clustering with deep embeddings

23 March 2020

journal article
research article
Published by Springer Science and Business Media LLC in Scientometrics

Vol. 123 (2), 563-577
https://doi.org/10.1007/s11192-020-03396-7

Abstract

The analysis of scientific and technical documents is crucial to establishing science and technology strategies. One popular method for such analysis is for field experts to manually classify each scientific or technical document into one of several predefined technical categories. However, manual classification is not only error-prone and expensive, but also requires sustained effort to keep up with frequent data updates. In contrast, machine learning and text mining techniques enable cheaper and faster operations, and can alleviate the burden on human resources. In this paper, we propose a method that extracts embedded feature vectors by applying a neural embedding approach to the text features of patent documents, and then automatically clusters those vectors using a deep embedding clustering method.
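The abstract does not spell out the clustering step, so the following is only a minimal sketch of the soft-assignment and target-distribution computations commonly associated with deep embedding clustering, assuming the standard Student's t formulation. The function names (`soft_assign`, `target_distribution`, `hard_labels`), the parameter `alpha`, and the toy two-dimensional "document embeddings" are hypothetical and chosen purely for illustration; they are not taken from the paper, and the encoder that would produce real embeddings is omitted.

```python
from typing import List, Sequence

Vector = Sequence[float]


def soft_assign(embeddings: Sequence[Vector],
                centroids: Sequence[Vector],
                alpha: float = 1.0) -> List[List[float]]:
    """Return q[i][j], the soft assignment of embedding i to centroid j.

    Uses a Student's t kernel on the squared Euclidean distance
    (an assumed, standard choice; `alpha` is the degrees of freedom).
    """
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            sq_dist = sum((a - b) ** 2 for a, b in zip(z, mu))
            row.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalise over clusters
    return q


def target_distribution(q: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return sharpened targets p[i][j] proportional to q[i][j]**2 / f[j],
    where f[j] is the soft cluster frequency sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def hard_labels(q: Sequence[Sequence[float]]) -> List[int]:
    """Assign each embedding to its most probable cluster."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy embeddings and initial centroids (made-up numbers, not patent data).
embeddings = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [4.9, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]

q = soft_assign(embeddings, centroids)
p = target_distribution(q)
labels = hard_labels(q)
```

In a full pipeline under these assumptions, the encoder and centroids would be updated iteratively to reduce the divergence between `p` and `q`; the sketch covers a single evaluation of the two distributions only.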

Keywords

68U15

Funding Information

National Research Foundation of Korea (NRF-2015R1C1A1A01056185)
National Research Foundation of Korea (2018R1D1A1B07045825)

Cited by 33 articles