Commit 1408e5df authored by Christian Boulanger

amend

parent dd068e7b
%% Cell type:markdown id:a7894c78ec06bd10 tags:
# Translate TEI/bibl to the final gold standard schema
%% Cell type:code id:2a90251a tags:
``` python
%load_ext autoreload
%autoreload 2
```
%% Output
The autoreload extension is already loaded. To reload it, use:
%reload_ext autoreload
%% Cell type:markdown id:bac6fffc tags:
Add an `xml:id` attribute to all `bibl` elements so that they can be matched later.
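
As a rough illustration of this step, the sketch below assigns sequential `xml:id` values to `bibl` elements in an in-memory TEI fragment. It uses the standard library's `xml.etree.ElementTree`; how `add_id_to_bibl` actually works is not shown here, so the helper name `add_ids`, the `bibl-N` id pattern, and the sample document are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_ID = f"{{{XML_NS}}}id"  # qualified name of the xml:id attribute


def add_ids(root, prefix="bibl"):
    """Give every <bibl> without an xml:id a sequential one (hypothetical scheme).

    Returns the number of ids that were added.
    """
    added = 0
    for position, bibl in enumerate(root.iter(f"{{{TEI_NS}}}bibl"), start=1):
        if XML_ID not in bibl.attrib:
            bibl.set(XML_ID, f"{prefix}-{position}")
            added += 1
    return added


# Made-up sample document, not taken from the corpus
sample = (
    f'<TEI xmlns="{TEI_NS}"><text><body><listBibl>'
    "<bibl>First reference</bibl><bibl>Second reference</bibl>"
    "</listBibl></body></text></TEI>"
)
root = ET.fromstring(sample)
added = add_ids(root)
ids = [bibl.get(XML_ID) for bibl in root.iter(f"{{{TEI_NS}}}bibl")]
```

Existing ids are left untouched, so running the helper twice does not renumber anything.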
%% Cell type:code id:338af7ddf4cc739d tags:
``` python
from lib.gold_standard import add_id_to_bibl
add_id_to_bibl('./tei-bibl-corrected')
```
%% Output
- Processing ./tei-bibl-corrected\10.1111_1467-6478.00057.xml
- Processing ./tei-bibl-corrected\10.1111_1467-6478.00080.xml
- Processing ./tei-bibl-corrected\10.1515_zfrs-1980-0103.xml
- Processing ./tei-bibl-corrected\10.1515_zfrs-1980-0104.xml
%% Cell type:markdown id:f18f6515 tags:
Create `biblStruct` from `bibl`:
%% Cell type:code id:d39d9f75 tags:
``` python
from lib.xslt import transform
transform(xslt_path='lib/xslt/convert_tei-to-biblstruct_bibl.xsl',
          input_path='tei-bibl-corrected',
          output_path='tei-biblStruct',
          rename_extension=('-bibl_biblStruct.TEIP5.xml', '.biblStruct.xml')).stderr
```
%% Output
Applied lib\xslt\convert_tei-to-biblstruct_bibl.xsl to files in tei-bibl-corrected and saved result in tei-biblStruct.
''
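
The `rename_extension` argument in the call above looks like a pair of (old suffix, new suffix). Assuming that reading is right, the renaming could be sketched as follows; `rename_with_suffix` and the sample filenames are hypothetical and not part of `lib.xslt`.

```python
from pathlib import PurePath


def rename_with_suffix(filename, old_suffix, new_suffix):
    """Swap old_suffix at the end of the file name for new_suffix.

    Names that do not end with old_suffix are returned unchanged.
    """
    name = PurePath(filename).name
    if name.endswith(old_suffix):
        return name[: -len(old_suffix)] + new_suffix
    return name


# Made-up filenames for illustration only
renamed = rename_with_suffix(
    "example-bibl_biblStruct.TEIP5.xml",
    "-bibl_biblStruct.TEIP5.xml",
    ".biblStruct.xml",
)
unchanged = rename_with_suffix(
    "example.xml",
    "-bibl_biblStruct.TEIP5.xml",
    ".biblStruct.xml",
)
```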
%% Cell type:code id:2cc1a0d6 tags:
``` python
from lib.gold_standard import create_all_gold_standards
create_all_gold_standards('tei-bibl-corrected',
                          'tei-biblStruct',
                          'gold',
                          verbose=False)
```
%% Output
### Processing 10.1111_1467-6478.00057
Files: [TEI/bibl](tei-bibl-corrected/10.1111_1467-6478.00057.xml) | [TEI/biblStruct](tei-biblStruct/10.1111_1467-6478.00057.biblstruct.xml) | [TEI/biblStruct Gold Standard](gold/10.1111_1467-6478.00057.xml)
Files: [TEI/bibl](tei-bibl-corrected/10.1111_1467-6478.00057.xml) | [TEI/biblStruct](tei-biblStruct/10.1111_1467-6478.00057.biblstruct.xml) | [Gold Standard](gold/10.1111_1467-6478.00057.xml)
Unexpected exception formatting exception. Falling back to standard exception
### Processing 10.1111_1467-6478.00080
Files: [TEI/bibl](tei-bibl-corrected/10.1111_1467-6478.00080.xml) | [TEI/biblStruct](tei-biblStruct/10.1111_1467-6478.00080.biblstruct.xml) | [Gold Standard](gold/10.1111_1467-6478.00080.xml)
Traceback (most recent call last):
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\interactiveshell.py", line 3526, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "C:\Users\boulanger\AppData\Local\Temp\ipykernel_23656\3760719036.py", line 3, in <module>
create_all_gold_standards('tei-bibl-corrected',
File "c:\Users\boulanger\DataspellProjects\experiments\convert-anystyle-data\lib\gold_standard.py", line 223, in create_all_gold_standards
output_data = create_gold_standard(bibl_content, biblStruct_content, verbose=verbose)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\DataspellProjects\experiments\convert-anystyle-data\lib\gold_standard.py", line 70, in create_gold_standard
for parent_element in bibl_tree.xpath(bibl_parent_xpath, namespaces=ns):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src\\lxml\\etree.pyx", line 1606, in lxml.etree._Element.xpath
File "src\\lxml\\xpath.pxi", line 290, in lxml.etree.XPathElementEvaluator.__call__
File "src\\lxml\\xpath.pxi", line 210, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: Invalid expression
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\interactiveshell.py", line 2120, in showtraceback
stb = self.InteractiveTB.structured_traceback(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 1435, in structured_traceback
return FormattedTB.structured_traceback(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 1326, in structured_traceback
return VerboseTB.structured_traceback(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 1173, in structured_traceback
formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 1088, in format_exception_as_a_whole
frames.append(self.format_record(record))
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 970, in format_record
frame_info.lines, Colors, self.has_colors, lvals
^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\IPython\core\ultratb.py", line 792, in lines
return self._sd.lines
^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\core.py", line 698, in lines
pieces = self.included_pieces
^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\core.py", line 649, in included_pieces
pos = scope_pieces.index(self.executing_piece)
^^^^^^^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
^^^^^^^^^^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\stack_data\core.py", line 628, in executing_piece
return only(
^^^^^
File "c:\Users\boulanger\AppData\Local\miniconda3\Lib\site-packages\executing\executing.py", line 164, in only
raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0
### Processing 10.1515_zfrs-1980-0103
Files: [TEI/bibl](tei-bibl-corrected/10.1515_zfrs-1980-0103.xml) | [TEI/biblStruct](tei-biblStruct/10.1515_zfrs-1980-0103.biblstruct.xml) | [Gold Standard](gold/10.1515_zfrs-1980-0103.xml)
### Processing 10.1515_zfrs-1980-0104
Files: [TEI/bibl](tei-bibl-corrected/10.1515_zfrs-1980-0104.xml) | [TEI/biblStruct](tei-biblStruct/10.1515_zfrs-1980-0104.biblstruct.xml) | [Gold Standard](gold/10.1515_zfrs-1980-0104.xml)
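
The `lxml.etree.XPathEvalError: Invalid expression` in the traceback above is raised when `bibl_tree.xpath(...)` receives an expression that libxml2 cannot parse. The value of `bibl_parent_xpath` is not shown, so the cause cannot be determined from this output. One way to narrow it down is to compile the expression on its own before using it, as in this minimal sketch. The helper `check_xpath` and both sample expressions are made up for illustration.

```python
from lxml import etree

NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def check_xpath(expression, namespaces=NS):
    """Return None if the expression compiles, otherwise the error message."""
    try:
        etree.XPath(expression, namespaces=namespaces)
    except etree.XPathError as err:  # base class of the lxml XPath errors
        return str(err)
    return None


# Hypothetical expressions: the second one has an unclosed predicate
good = check_xpath("//tei:listBibl[tei:bibl]")
bad = check_xpath("//tei:listBibl[tei:bibl")
```

Logging the failing expression together with the file name would make it easier to see which input produces a malformed XPath string.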