A Use Case of Patent Classification Using Deep Learning with Transfer Learning

Open Access | Aug 2022

Figures & Tables

Figure 1

Evolution of patent applications and grants at IP5 offices from 2009 to 2019 (IP5, 2019).

Figure 2

Patent applications in Portugal since 2010.

Figure 3

Number of patents by section.

Figure 4

Frequency of classes by the number of patents.

Figure 5

Boxplots of text size by section: a) with outliers and b) without outliers.

Figure 6

Word cloud of the most frequent words by section.

Figure 7

Applied methodology.

Figure 8

Most frequent words of class G06 compared with those of its most similar classes.

Mean F1 score (cross-validation k=5) with different feature engineering methods

| Model | F1_weighted (%) |
|---|---|
| LinearSVC (baseline) | 60.8 |
| CNN | 50 |
| DistilBERT Multilingual | 50.1 |
| BiLSTM | 57 |
| ULMFiT | 57 |
| BERT-Base Multilingual | 59.5 |
| BERTimbau | 63.6 |
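
The F1_weighted column reports a weighted F1 score. As a minimal sketch of how such a score is commonly computed (per-class F1 averaged with weights proportional to each class's number of true samples, which is also what scikit-learn's `f1_score(average="weighted")` does), the following pure-Python function may help. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the paper, and the authors' exact evaluation code is not shown in the source.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += f1 * n_true / total
    return score


# Hypothetical toy labels using IPC section letters; not data from the paper.
y_true = ["A", "A", "A", "G", "G", "H"]
y_pred = ["A", "A", "G", "G", "H", "H"]
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.4f}")
```

Because large classes dominate the weighted average, a model can score well overall while still performing poorly on small sections.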

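Precision, Recall and F1 score by Section

| Section | Precision | Recall | F1 |
|---|---|---|---|
| A | 0.7923 | 0.7 | 0.7433 |
| H | 0.6805 | 0.7248 | 0.7019 |
| C | 0.6515 | 0.7427 | 0.6941 |
| E | 0.5981 | 0.5923 | 0.5952 |
| F | 0.5446 | 0.5393 | 0.5419 |
| B | 0.5291 | 0.5388 | 0.5339 |
| G | 0.4903 | 0.4633 | 0.4764 |
| D | 0.5098 | 0.4041 | 0.4509 |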
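
As a quick consistency check on the per-section table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes F1 from the reported precision and recall values and measures how far the result is from the reported F1. The helper names `f1_from` and `max_discrepancy` are assumptions introduced for illustration; the numbers are copied from the table.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


# (precision, recall, reported F1) per IPC section, copied from the table.
reported = {
    "A": (0.7923, 0.7, 0.7433),
    "H": (0.6805, 0.7248, 0.7019),
    "C": (0.6515, 0.7427, 0.6941),
    "E": (0.5981, 0.5923, 0.5952),
    "F": (0.5446, 0.5393, 0.5419),
    "B": (0.5291, 0.5388, 0.5339),
    "G": (0.4903, 0.4633, 0.4764),
    "D": (0.5098, 0.4041, 0.4509),
}


def max_discrepancy(rows):
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_from(p, r) - f1) for p, r, f1 in rows.values())


for section, (p, r, f1) in reported.items():
    print(f"{section}: recomputed {f1_from(p, r):.4f}, reported {f1:.4f}")
```

Small gaps in the last digit are expected, since the reported precision and recall are themselves rounded.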

Patent classification related studies

| Authors | Feature Engineering | Algorithm | Section | Language | Dataset size | Number of classes |
|---|---|---|---|---|---|---|
| (Trappey et al., 2006) | Key phrases frequency based on TF-IDF | Neural Networks | full document | English | 300 training, 124 test | 9 |
| (Derieux et al., 2010) | Terms extraction and semantic relation | SVM | full document | English, German, French | 985 training, 2,000 test | 630 |
| (Trappey et al., 2013) | Key phrases frequency based on TF-IDF | Ontology-Based Neural Network | full document | English | 333 training, 160 test | 23 |
| (Zhang, 2014) | - | SVM | - | English | 5,000 | 5 |
| (Wu et al., 2016) | SOM, KPCA | SVM | full document | English | 60,000 | 7 |
| (Li et al., 2018) | Skip-gram | CNN | title and abstract | English | 742,097 training, 1,350 test | 637 |
| (Risch & Krestel, 2019) | Domain-specific FastText word embeddings | Bi-directional GRU | title and abstract | English | ~1.7M training, ~300,000 test | 637 |
| (Abdelgawad et al., 2020) | GloVe, Word2Vec, FastText | Hierarchical SVM and CNN with BOHB (Bayesian Optimization Hyperband) | title, abstract, description, and claims | English | 75,000 training, 28,926 test | 451 |
| (Lee & Hsiang, 2020) | - | BERT-Base | claims | English | 1,950,247 training, 150,000 test | 632 |

F1 score on the test set

| Model | F1_weighted (%) |
|---|---|
| LinearSVC (baseline) | 60.8 |
| CNN | 50 |
| DistilBERT Multilingual | 50.1 |
| BiLSTM | 57 |
| ULMFiT | 57 |
| BERT-Base Multilingual | 59.5 |
| BERTimbau | 63.6 |

IPC Areas of Technology

| Section | Description |
|---|---|
| A | Human Necessities |
| B | Performing Operations; Transporting |
| C | Chemistry; Metallurgy |
| D | Textiles; Paper |
| E | Fixed Constructions |
| F | Mechanical Engineering; Lighting; Heating; Weapons; Blasting Engines or Pumps |
| G | Physics |
| H | Electricity |

Features used in the analysis

| Feature | Description |
|---|---|
| id | Patent internal identification |
| Title | Descriptive name of the patent |
| Claims | The legal scope of the invention, including delimitations and application field |
| Abstract | A brief description of the invention presented in the patent |
| Section | IPC 1st level classification code |
| Class | IPC 2nd level classification code |
| Subclass | IPC 3rd level classification code |
| Main group | IPC 4th level classification code |
| Subgroup | IPC 5th level classification code |

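
The five IPC levels listed above are nested inside a single classification symbol. The sketch below shows one way to split a symbol such as `G06F 17/30` into those levels. It assumes the standard IPC notation (section letter, two-digit class, subclass letter, then `main group/subgroup`); the function `parse_ipc`, the `IPCCode` type, and the example symbols are illustrative and not taken from the paper, whose dataset may store codes in a different format.

```python
import re
from typing import NamedTuple


class IPCCode(NamedTuple):
    """The five hierarchical IPC levels (illustrative container)."""
    section: str     # 1st level, e.g. "G"
    ipc_class: str   # 2nd level, e.g. "G06"
    subclass: str    # 3rd level, e.g. "G06F"
    main_group: str  # 4th level, e.g. "G06F 17/00"
    subgroup: str    # 5th level, e.g. "G06F 17/30"


# Assumed layout: section A-H, two-digit class, subclass letter, group/subgroup.
_IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def parse_ipc(symbol: str) -> IPCCode:
    """Split an IPC symbol into its five levels; raise ValueError if malformed."""
    match = _IPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable IPC symbol: {symbol!r}")
    section, digits, letter, group, subgroup = match.groups()
    ipc_class = section + digits
    subclass = ipc_class + letter
    return IPCCode(
        section=section,
        ipc_class=ipc_class,
        subclass=subclass,
        main_group=f"{subclass} {group}/00",  # main groups end in /00
        subgroup=f"{subclass} {group}/{subgroup}",
    )


print(parse_ipc("G06F 17/30"))
```

Each deeper level multiplies the number of possible labels, so the level chosen as the prediction target largely determines how hard the classification task is.
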
DOI: https://doi.org/10.2478/jdis-2022-0015 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 49 - 70
Submitted on: Mar 12, 2022
Accepted on: Jul 4, 2022
Published on: Aug 12, 2022
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2022 Roberto Henriques, Adria Ferreira, Mauro Castelli, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.