Commit b5202f8

Merge pull request #373 from MITLibraries/USE-181-read-embeddings-with-tda
Use 181 read embeddings with tda
2 parents d0511c0 + fe79bd9 commit b5202f8

16 files changed: 1,089 additions and 1,139 deletions

.github/CODEOWNERS

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# CODEOWNERS file (from GitHub template at
+# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners)
+# Each line is a file pattern followed by one or more owners.
+
+################################################################################
+# These owners will be the default owners for everything in the repo. This is commented
+# out in favor of using a team as the default (see below). It is left here as a comment
+# to indicate the primary expert for this code.
+# * @adamshire123
+
+# Teams can be specified as code owners as well. Teams should be identified in
+# the format @org/team-name. Teams must have explicit write access to the
+# repository.
+* @mitlibraries/dataeng
+
+# We set the senior engineer in the team as the owner of the CODEOWNERS file as
+# a layer of protection for unauthorized changes.
+/.github/CODEOWNERS @ghukill

.github/pull-request-template.md

Lines changed: 2 additions & 12 deletions
@@ -15,15 +15,5 @@ YES | NO
 ### What are the relevant tickets?
 - Include links to Jira Software and/or Jira Service Management tickets here.
 
-### Developer
-- [ ] All new ENV is documented in README
-- [ ] All new ENV has been added to staging and production environments
-- [ ] All related Jira tickets are linked in commit message(s)
-- [ ] Stakeholder approval has been confirmed (or is not needed)
-
-### Code Reviewer(s)
-- [ ] The commit message is clear and follows our guidelines (not just this PR message)
-- [ ] There are appropriate tests covering any new functionality
-- [ ] The provided documentation is sufficient for understanding any new functionality introduced
-- [ ] Any manual tests have been performed **or** provided examples have been verified
-- [ ] New dependencies are appropriate or there were no changes
+### Code review
+* Code review best practices are documented [here](https://mitlibraries.github.io/guides/collaboration/code_review.html) and you are encouraged to have a constructive dialogue with your reviewers about their preferences and expectations.

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -24,6 +24,6 @@ repos:
         types: ["python"]
       - id: pip-audit
         name: pip-audit
-        entry: pipenv run pip-audit --ignore-vuln GHSA-4xh5-x5gv-qwph
+        entry: pipenv run pip-audit
         language: system
         pass_filenames: false

Makefile

Lines changed: 3 additions & 2 deletions
@@ -1,4 +1,5 @@
 ### This is the Terraform-generated header for timdex-index-manager-dev ###
+.PHONY: test
 ECR_NAME_DEV:=timdex-index-manager-dev
 ECR_URL_DEV:=222053980223.dkr.ecr.us-east-1.amazonaws.com/timdex-index-manager-dev
 ### End of Terraform-generated header ###
@@ -28,7 +29,7 @@ update: install # Update Python dependencies
 ######################
 
 test: # Run tests and print a coverage report
-	pipenv run coverage run --source=tim -m pytest -vv
+	pipenv run coverage run --source=tim -m pytest -vv
 	pipenv run coverage report -m
 
 coveralls: test # Write coverage data to an LCOV report
@@ -50,7 +51,7 @@ ruff: # Run 'ruff' linter and print a preview of errors
 	pipenv run ruff check .
 
 safety: # Check for security vulnerabilities and verify Pipfile.lock is up-to-date
-	pipenv run pip-audit --ignore-vuln GHSA-4xh5-x5gv-qwph
+	pipenv run pip-audit
 	pipenv verify
 
 lint-apply: black-apply ruff-apply # Apply changes with 'black' and resolve 'fixable errors' with 'ruff'

Pipfile.lock

Lines changed: 809 additions & 1078 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

config/opensearch_mappings.json

Lines changed: 3 additions & 0 deletions
@@ -132,6 +132,9 @@
       "edition": {
         "type": "text"
       },
+      "embedding_full_record": {
+        "type": "rank_features"
+      },
       "file_formats": {
         "type": "keyword",
         "normalizer": "lowercase"
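
The hunk above maps `embedding_full_record` as `rank_features`, a field type that, as far as its documentation describes, stores a map of feature names to positive numeric weights. A minimal sketch of a pre-indexing check under that assumption follows. The helper name `validate_rank_features` and the token weights are hypothetical and do not come from this commit.

```python
import math
from collections.abc import Mapping


def validate_rank_features(weights: Mapping[str, float]) -> dict[str, float]:
    """Return a clean copy of `weights`, or raise ValueError on a bad entry.

    Assumption: a rank_features value must be a flat map of string keys to
    strictly positive, finite numbers.
    """
    cleaned: dict[str, float] = {}
    for token, weight in weights.items():
        if not isinstance(token, str) or not token:
            raise ValueError(f"feature name must be a non-empty string: {token!r}")
        value = float(weight)
        if not math.isfinite(value) or value <= 0:
            raise ValueError(f"weight for {token!r} must be positive and finite")
        cleaned[token] = value
    return cleaned


# Hypothetical document: the token names and weights are invented for illustration.
document = {
    "timdex_record_id": "libguides:guides-175846",
    "embedding_full_record": validate_rank_features(
        {"materials": 1.7, "engineering": 0.9}
    ),
}
```

Rejecting zero or negative weights before the bulk request keeps a single bad token from failing the whole document at index time.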
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+interactions:
+- request:
+    body: null
+    headers:
+      content-type:
+      - application/json
+      user-agent:
+      - opensearch-py/2.8.0 (Python 3.12.11)
+    method: GET
+    uri: http://localhost:9200/_cat/aliases?format=json
+  response:
+    body:
+      string: '[{"alias":"all-current","index":"libguides-2025-12-11t16-36-09","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"},{"alias":"libguides","index":"libguides-2025-12-11t16-36-09","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"},{"alias":"all-current","index":"test-index-2025-12-11t16-58-08","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"},{"alias":"test-index","index":"test-index-2025-12-11t16-58-08","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"},{"alias":".kibana","index":".kibana_1","filter":"-","routing.index":"-","routing.search":"-","is_write_index":"-"}]'
+    headers:
+      content-length:
+      - '671'
+      content-type:
+      - application/json; charset=UTF-8
+    status:
+      code: 200
+      message: OK
+version: 1
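
The recorded `_cat/aliases?format=json` response above is a flat list of alias/index rows, so one alias (for example `all-current`) can appear several times. A small sketch of how a test might group those rows is shown below. The function name `indexes_by_alias` is hypothetical, and the sample payload is a trimmed copy of the recorded body that keeps only the two fields used.

```python
import json
from collections import defaultdict


def indexes_by_alias(cat_aliases_body: str) -> dict[str, list[str]]:
    """Group the rows of a `_cat/aliases?format=json` body by alias name."""
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for row in json.loads(cat_aliases_body):
        grouped[row["alias"]].append(row["index"])
    # Sort so the result does not depend on the row order in the response.
    return {alias: sorted(indexes) for alias, indexes in grouped.items()}


# Trimmed copy of the recorded response (only "alias" and "index" are kept).
recorded_body = json.dumps(
    [
        {"alias": "all-current", "index": "libguides-2025-12-11t16-36-09"},
        {"alias": "libguides", "index": "libguides-2025-12-11t16-36-09"},
        {"alias": "all-current", "index": "test-index-2025-12-11t16-58-08"},
        {"alias": "test-index", "index": "test-index-2025-12-11t16-58-08"},
        {"alias": ".kibana", "index": ".kibana_1"},
    ]
)
aliases = indexes_by_alias(recorded_body)
```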
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+interactions:
+- request:
+    body: '{"update":{"_id":"i-am-not-found","_index":"test-index"}}
+
+      {"doc":{"timdex_record_id":"i-am-not-found","title":"Materials Science & Engineering
+      (UPDATED)"}}
+
+      '
+    headers:
+      Content-Length:
+      - '156'
+      content-type:
+      - application/json
+      user-agent:
+      - opensearch-py/2.8.0 (Python 3.12.11)
+    method: POST
+    uri: http://localhost:9200/_bulk
+  response:
+    body:
+      string: '{"took":9,"errors":true,"items":[{"update":{"_index":"test-index-2025-12-11t16-58-08","_id":"i-am-not-found","status":404,"error":{"type":"document_missing_exception","reason":"[i-am-not-found]:
+        document missing","index":"test-index-2025-12-11t16-58-08","shard":"0","index_uuid":"in04_JvQS5qqCvUXeZta_g"}}}]}'
+    headers:
+      content-length:
+      - '308'
+      content-type:
+      - application/json; charset=UTF-8
+    status:
+      code: 200
+      message: OK
+version: 1
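
In the cassette above, the `_bulk` call returns HTTP 200 even though the only item failed with status 404; the failure is reported through `"errors": true` and the per-item `error` object. A minimal sketch of pulling those failures out of a parsed response follows. The helper name `failed_bulk_items` is hypothetical and is not taken from this repository.

```python
import json


def failed_bulk_items(bulk_response: dict) -> list[dict]:
    """Return one summary dict per failed item in a parsed `_bulk` response."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict such as {"update": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append(
                    {
                        "action": action,
                        "id": result["_id"],
                        "status": result["status"],
                        "type": result["error"]["type"],
                    }
                )
    return failures


# Response body copied from the recorded cassette.
recorded = json.loads(
    '{"took":9,"errors":true,"items":[{"update":{"_index":"test-index-2025-12-11t16-58-08",'
    '"_id":"i-am-not-found","status":404,"error":{"type":"document_missing_exception",'
    '"reason":"[i-am-not-found]: document missing","index":"test-index-2025-12-11t16-58-08",'
    '"shard":"0","index_uuid":"in04_JvQS5qqCvUXeZta_g"}}}]}'
)
failures = failed_bulk_items(recorded)
```

Checking the per-item `error` key rather than the HTTP status is what lets a caller tell a missing document apart from a transport failure.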
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+interactions:
+- request:
+    body: '{"update":{"_id":"libguides:guides-175846","_index":"test-index"}}
+
+      {"doc":{"timdex_record_id":"libguides:guides-175846","title":"Materials Science
+      & Engineering (UPDATED)"}}
+
+      '
+    headers:
+      Content-Length:
+      - '174'
+      content-type:
+      - application/json
+      user-agent:
+      - opensearch-py/2.8.0 (Python 3.12.11)
+    method: POST
+    uri: http://localhost:9200/_bulk
+  response:
+    body:
+      string: '{"took":7,"errors":false,"items":[{"update":{"_index":"test-index-2025-12-11t16-58-08","_id":"libguides:guides-175846","_version":4,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":7,"_primary_term":1,"status":200}}]}'
+    headers:
+      content-length:
+      - '245'
+      content-type:
+      - application/json; charset=UTF-8
+    status:
+      code: 200
+      message: OK
+- request:
+    body: null
+    headers:
+      Content-Length:
+      - '0'
+      content-type:
+      - application/json
+      user-agent:
+      - opensearch-py/2.8.0 (Python 3.12.2)
+    method: POST
+    uri: http://localhost:9200/test-index/_refresh
+  response:
+    body:
+      string: '{"_shards":{"total":2,"successful":1,"failed":0}}'
+    headers:
+      content-length:
+      - '49'
+      content-type:
+      - application/json; charset=UTF-8
+    status:
+      code: 200
+      message: OK
+version: 1
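
The request body recorded above is newline-delimited JSON: an `update` action line, a `{"doc": ...}` partial-document line, and a trailing newline. The sketch below rebuilds that payload with the standard library only, to show the shape. In the recorded run the body was produced by opensearch-py, so the function name `bulk_update_body` and the compact separators are assumptions made for illustration.

```python
import json


def bulk_update_body(index: str, doc_id: str, fields: dict) -> str:
    """Build a one-document `_bulk` partial-update payload as NDJSON."""
    action = {"update": {"_id": doc_id, "_index": index}}
    partial = {"doc": fields}
    lines = [
        json.dumps(action, separators=(",", ":")),
        json.dumps(partial, separators=(",", ":")),
    ]
    # The bulk API expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n"


body = bulk_update_body(
    index="test-index",
    doc_id="libguides:guides-175846",
    fields={
        "timdex_record_id": "libguides:guides-175846",
        "title": "Materials Science & Engineering (UPDATED)",
    },
)
```

With these inputs the encoded body should come out at the same byte count as the recorded `Content-Length` header, which is a cheap way to confirm the sketch matches the recorded request.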
