texta / texta-mlp-python
Commit 35497c3c
Authored Apr 08, 2022 by Wael Ramadan

refactor no need to sets of json

Parent: 28f42f75
Pipeline #8006 passed with stage in 8 minutes and 25 seconds
texta_mlp/document.py
-import json
 import math
 from typing import List, Optional
...
@@ -144,10 +143,8 @@ class Document:
     def remove_duplicate_facts(facts: List[dict]):
         if facts:
             facts = Document.handle_null_values_in_facts(facts)
-            set_of_jsons = {json.dumps(fact, sort_keys=True, ensure_ascii=False) for fact in facts}
-            without_duplicates = [json.loads(unique_fact) for unique_fact in set_of_jsons]
-            without_duplicates_ignored_keys = list(Document.remove_duplicates_with_ignored_keys(without_duplicates, ["id", "source"]))
-            return without_duplicates_ignored_keys
+            without_duplicates = list(Document.remove_duplicates_with_ignored_keys(facts, ["id", "source"]))
+            return without_duplicates
         else:
             return []
...
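The reasoning behind the change appears to be that deduplicating while ignoring "id" and "source" already collapses exact duplicates, so the JSON set round-trip adds nothing. The sketch below illustrates that under stated assumptions: `dedupe_ignoring_keys` is a hypothetical stand-in for `Document.remove_duplicates_with_ignored_keys` (its real body is not shown in this commit), and the sample facts and their field names are made up for illustration.

```python
import json
from typing import Iterator, List

IGNORED = ["id", "source"]


def comparable(fact: dict, ignored_keys: List[str]) -> str:
    # Canonical string of a fact with the ignored keys dropped.
    kept = {k: v for k, v in fact.items() if k not in ignored_keys}
    return json.dumps(kept, sort_keys=True, ensure_ascii=False)


def dedupe_ignoring_keys(facts: List[dict], ignored_keys: List[str]) -> Iterator[dict]:
    # Hypothetical stand-in for Document.remove_duplicates_with_ignored_keys:
    # yields the first fact seen for each distinct set of non-ignored fields.
    seen = set()
    for fact in facts:
        key = comparable(fact, ignored_keys)
        if key not in seen:
            seen.add(key)
            yield fact


def old_style(facts: List[dict]) -> List[dict]:
    # Pre-commit flow: exact dedupe through a set of JSON strings, then ignored-key dedupe.
    set_of_jsons = {json.dumps(f, sort_keys=True, ensure_ascii=False) for f in facts}
    exact_unique = [json.loads(s) for s in set_of_jsons]
    return list(dedupe_ignoring_keys(exact_unique, IGNORED))


def new_style(facts: List[dict]) -> List[dict]:
    # Post-commit flow: ignored-key dedupe only.
    return list(dedupe_ignoring_keys(facts, IGNORED))


# Made-up sample data; the field names are illustrative only.
facts = [
    {"fact": "PER", "str_val": "Alice", "id": 1, "source": "a"},
    {"fact": "PER", "str_val": "Alice", "id": 1, "source": "a"},  # exact duplicate
    {"fact": "PER", "str_val": "Alice", "id": 2, "source": "b"},  # differs only in ignored keys
    {"fact": "LOC", "str_val": "Tartu", "id": 3, "source": "a"},
]

old_keys = {comparable(f, IGNORED) for f in old_style(facts)}
new_keys = {comparable(f, IGNORED) for f in new_style(facts)}
print(old_keys == new_keys)
```

Both flows keep the same distinct facts once the ignored keys are set aside. One difference worth noting: iterating a Python set has no guaranteed order, so the old flow could keep either the id 1 or the id 2 copy of a fact, whereas a first-seen pass like the stand-in above always keeps the earliest one.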