John-Mark Gurney
5346ea06f0
forgot to include links to TIFF 6 standard...
1 year ago
John-Mark Gurney
50c35a8a85
add ifd builder (for testing), and make sure unknown enums are handled..
when I switched to the included enum, the unknown enum exception switched
from KeyError to ValueError, this now fixes that..
1 year ago
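A minimal sketch of the exception difference mentioned above, assuming the standard library enum module is the "included enum". Looking a member up by value raises ValueError for an unknown value, while looking it up by name raises KeyError. The TagType class and lookup helper are hypothetical names for illustration, not the project's code.

```python
import enum


class TagType(enum.IntEnum):
    # Two field types from the TIFF 6 specification, used only as sample data.
    BYTE = 1
    ASCII = 2


def lookup(value):
    """Return the matching enum member, or the raw value if it is unknown."""
    try:
        return TagType(value)      # lookup by value
    except ValueError:             # stdlib enum raises ValueError, not KeyError
        return value


known = lookup(2)
unknown = lookup(99)
```

Keeping the raw integer for unknown values lets a parser carry on past tags it does not recognize.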
John-Mark Gurney
4f60a6ce2c
clean up tiff_ifd, use islice instead of complicated math..
1 year ago
John-Mark Gurney
67d44c1e95
fix string to follow the standard, other minor fixes..
fixing strings let me drop some special handlers..
Use BytesIO instead of fileoff so less code to maintain.. Maybe
want to just use bytes instead?
1 year ago
John-Mark Gurney
8e47cf3d6f
add the script to make the exif.jpeg file used for testing..
1 year ago
John-Mark Gurney
a02f32f410
add spec links, drop dead code, add JPEG EXIF parsing...
also minor code coverage tweak due to a bug
1 year ago
John-Mark Gurney
95fef909d8
checkpoint this work, confirmed parsing CRW and CR2 files...
This still needs some cleanup and additional tests.. This isn't
hooked into the testing system yet as I still haven't decided if
I'm going to commit fixtures or not (or maybe make this its own
repo)..
IFD needs serious cleanup.. I should be using a classmethod instead
of the janky nextptr bs.
1 year ago
John-Mark Gurney
5a4fe83d64
make sure all files are processed..
this makes pre-caching files easier, and likely more what a user
expects..
1 year ago
John-Mark Gurney
573c975066
older versions of the library return this instead..
1 year ago
John-Mark Gurney
f529d0cafd
add support for zip archives...
This imports magic.py from file-magic and merges magic_wrap.py into
it...
This also updates detect_from_filename to try w/ _COMPRESS, and if
it returns an error, fall back to normal mode. This is necessary as
[some?] zip files can be decompressed by gzip, but doing so throws an
error...
1 year ago
John-Mark Gurney
d50be18450
add raw version of this.. will have magic_wrap embedded in next commit
1 year ago
John-Mark Gurney
53580234dc
ignore patch residue
1 year ago
John-Mark Gurney
152be62e3f
drop unneeded comment, autosaved w/ sqlite now, add new test to write
1 year ago
John-Mark Gurney
9279a559f5
significantly improve search results, especially in the exclusion case..
The original query applied a complicated test, which sqlite couldn't
tell if it applied to all..
In the case of any inclusion, it's easy, only search metadata, and match
to files.
If all exclusion, make two parts, the part w/ a metadata object that
doesn't have the exclusions, and the part w/o any metadata objects..
Both of these latter two queries can be satisfied more simply and with
proper indices..
The old query might have worked fine on a more advanced DB, but this
change was necessary for decent performance..
1 year ago
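The all-exclusion case described above can be sketched with the standard sqlite3 module. The schema here (a files table and a tags table) is a hypothetical stand-in, not the project's real schema; the point is only that the search is split into two simple queries that can each use an index.

```python
import sqlite3

# Hypothetical schema, for illustration only.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tags (file_id INTEGER, tag TEXT);
CREATE INDEX tags_by_tag ON tags (tag, file_id);
CREATE INDEX tags_by_file ON tags (file_id);
INSERT INTO files VALUES (1, 'a.jpg'), (2, 'b.jpg'), (3, 'c.jpg');
INSERT INTO tags VALUES (1, 'cat'), (1, 'draft'), (2, 'cat');
""")


def search_excluding(tag):
    """Return names of files that do not carry the excluded tag."""
    # Part 1: files that have metadata, none of which is the excluded tag.
    with_meta = db.execute(
        "SELECT DISTINCT f.name FROM files f "
        "JOIN tags t ON t.file_id = f.id "
        "WHERE f.id NOT IN (SELECT file_id FROM tags WHERE tag = ?)",
        (tag,),
    )
    # Part 2: files that have no metadata at all.
    without_meta = db.execute(
        "SELECT f.name FROM files f "
        "WHERE NOT EXISTS (SELECT 1 FROM tags t WHERE t.file_id = f.id)"
    )
    return sorted(r[0] for r in with_meta) + sorted(r[0] for r in without_meta)


result = search_excluding("draft")
```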
John-Mark Gurney
d14fb005bb
on each successful exit, do a little db maintenance...
1 year ago
John-Mark Gurney
6056bbbdc7
improve search performance.. minor dump improvements by uuid or hash..
dump improvements need tests..
1 year ago
John-Mark Gurney
035d354930
add RDF mapping description..
1 year ago
John-Mark Gurney
37bd97e7f9
include more detail on how to provide migrations..
1 year ago
John-Mark Gurney
0f633e175d
fix pre 2.0 sqlalchemy usage, and import medashare for orm..
1 year ago
John-Mark Gurney
5f5833a501
added this test earlier today...
1 year ago
John-Mark Gurney
4718e1113b
convert to 2.0, that was easier than I thought..
1 year ago
John-Mark Gurney
0b4c024c02
handle host id mapping on search results..
1 year ago
John-Mark Gurney
5d82d5930a
make mapping alone print out existing mappings..
1 year ago
John-Mark Gurney
81a0c7f77f
do this iteratively so the user gets results sooner..
1 year ago
John-Mark Gurney
0af2f89c67
add support for missing metadata objects, improve queries a bit..
Some queries weren't using the full power of SQL and doing some of
the selection in the code, fix that (by_hash and host mapping).
1 year ago
John-Mark Gurney
0802947635
add support for valueless searches..
1 year ago
John-Mark Gurney
2bec4e15c1
add support for searching, not indexed yet...
1 year ago
John-Mark Gurney
3579fd59a8
make sure subcommand help prints more useful help message..
1 year ago
John-Mark Gurney
204233629e
better halflife support, fix some minor bugs..
1 year ago
John-Mark Gurney
fda8fb6d07
support the cache having a half life...
This will help keep popular tags near the top, while expiring less
used tags
1 year ago
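One way a half life can keep popular tags near the top is to decay every score a little on each use. The sketch below assumes the half life is measured in number of uses rather than wall-clock time; the DecayingCounts class and its methods are hypothetical and may differ from what the project does.

```python
class DecayingCounts:
    """Tag scores that halve after every `halflife` later uses."""

    def __init__(self, halflife):
        # Per-use decay factor, chosen so factor ** halflife == 0.5.
        self.factor = 0.5 ** (1 / halflife)
        self.scores = {}

    def touch(self, tag):
        # Age every existing score, then credit the tag that was just used.
        for key in self.scores:
            self.scores[key] *= self.factor
        self.scores[tag] = self.scores.get(tag, 0.0) + 1.0

    def top(self, limit):
        # Highest scores first, cut off at `limit` entries.
        return sorted(self.scores, key=self.scores.get, reverse=True)[:limit]


cache = DecayingCounts(halflife=2)
for tag in ("a", "a", "a", "b"):
    cache.touch(tag)
first_leader = cache.top(1)

for _ in range(5):
    cache.touch("b")
later_leader = cache.top(1)
```

A tag that stops being used drifts toward zero, so a limit on the number of entries naturally drops it first.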
John-Mark Gurney
ef95e9f03f
prep for decay, add support for limit and refactor tests..
1 year ago
John-Mark Gurney
108d1dff3d
fix an issue with large work counts...
My testing machine has 10 cpus, and so didn't trigger the failure
where not all the work was submitted.. We need to pop the completed
work items, and keep doing the for loop while we have futures to
process... this submits and processes all work..
1 year ago
John-Mark Gurney
7212192801
use threads to read/hash multiple datablocks at a time..
as we might have a lot of work to submit, BUT it might fail early,
don't submit too much work early on, just for it to fail, so we
limit how much work we submit..
1 year ago
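The bounded submission described above, together with the later fix of popping completed futures and looping while any remain, might look roughly like this with concurrent.futures. The function name, the max_pending and workers parameters, and the use of SHA-256 are assumptions for illustration.

```python
import hashlib
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice


def hash_blocks(blocks, max_pending=4, workers=2):
    """Hash blocks in threads, keeping at most max_pending submitted at once."""
    work = iter(enumerate(blocks))
    results = {}
    pending = {}  # future -> block index

    with ThreadPoolExecutor(max_workers=workers) as pool:

        def top_up():
            # Submit only enough new work to reach the limit again.
            for idx, block in islice(work, max_pending - len(pending)):
                pending[pool.submit(hashlib.sha256, block)] = idx

        top_up()
        while pending:  # keep going while there are futures to process
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for fut in done:
                idx = pending.pop(fut)  # pop the completed work item
                results[idx] = fut.result().hexdigest()  # re-raises on failure
            top_up()

    return [results[i] for i in range(len(results))]


blocks = [bytes([i]) * 10 for i in range(25)]
digests = hash_blocks(blocks)
```

Because an early failure propagates from fut.result(), at most max_pending items have been submitted when it happens.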
John-Mark Gurney
5ee796735b
add a class for hashing pieces in order...
This will be used to allow parallel processing of torrent pieces..
each piece of the torrent can be processed in parallel, and this
class will make sure that when processing the hash of a file in
the torrent, it will be hashed in the correct order...
1 year ago
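A minimal sketch of hashing pieces in order, assuming pieces arrive tagged with an index and may complete in any order. The InOrderHasher name and its methods are hypothetical; the actual class in this commit may be shaped differently.

```python
import hashlib


class InOrderHasher:
    """Accept pieces in any order, but feed them to the hash in index order."""

    def __init__(self, hashfactory=hashlib.sha256):
        self._hash = hashfactory()
        self._next = 0       # index of the next piece the hash needs
        self._pending = {}   # pieces that arrived early, keyed by index

    def submit(self, index, data):
        self._pending[index] = data
        # Drain every piece that is now contiguous with what was hashed.
        while self._next in self._pending:
            self._hash.update(self._pending.pop(self._next))
            self._next += 1

    def hexdigest(self):
        if self._pending:
            raise RuntimeError("pieces missing before index %d" % min(self._pending))
        return self._hash.hexdigest()


hasher = InOrderHasher()
for index, data in [(2, b"cc"), (0, b"aa"), (1, b"bb")]:
    hasher.submit(index, data)
digest = hasher.hexdigest()
```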
John-Mark Gurney
df16a93fe1
don't call a function excessively...
setdefault works best for simple data types, calling the factory
function every time doesn't make sense...
1 year ago
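The cost mentioned above comes from Python evaluating the default argument before setdefault runs, so the factory is called even when the key already exists. A small sketch, with make_bucket as a hypothetical factory and collections.defaultdict as one possible alternative (not necessarily the one this commit uses):

```python
from collections import defaultdict

calls = 0


def make_bucket():
    """Hypothetical factory; counts how often it is invoked."""
    global calls
    calls += 1
    return []


# setdefault: the factory runs on every iteration, even for an existing key.
eager = {}
for key in ("a", "a", "a"):
    eager.setdefault(key, make_bucket()).append(1)
eager_calls = calls

# defaultdict: the factory runs only when the key is actually missing.
calls = 0
lazy = defaultdict(make_bucket)
for key in ("a", "a", "a"):
    lazy[key].append(1)
lazy_calls = calls
```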
John-Mark Gurney
f215c73705
a few notes, spell not in correctly..
1 year ago
John-Mark Gurney
000bf03652
basic docs on how to do development..
2 years ago
John-Mark Gurney
21b3122604
add some useful examples for this project to the template..
2 years ago
John-Mark Gurney
e51b7a62ac
cover a couple missed cases for _archive..
hide coverage in test code
2 years ago
John-Mark Gurney
b75a4d82da
add support for archives, such as tar.gz...
2 years ago
John-Mark Gurney
2c9deb7c58
add some more namespaces, and talk about uri for hashes
2 years ago
John-Mark Gurney
c2ac1018e5
fix issue w/ calculating size of last piece...
2 years ago
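The commit above does not say what the issue was. As a hedged illustration, one common pitfall is using total_size % piece_length for the last piece, which gives 0 when the total is an exact multiple of the piece length. The piece_size helper below is a hypothetical sketch that avoids it.

```python
def piece_size(index, total_size, piece_length):
    """Return the byte length of piece `index` in a torrent-style layout."""
    num_pieces = -(-total_size // piece_length)  # ceiling division
    if not 0 <= index < num_pieces:
        raise IndexError("piece index out of range")
    # Every piece is full length except possibly the last one.
    return min(piece_length, total_size - index * piece_length)


short_last = piece_size(2, 10, 4)   # 10 bytes in pieces of 4: sizes 4, 4, 2
exact_last = piece_size(1, 8, 4)    # 8 bytes in pieces of 4: sizes 4, 4
```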
John-Mark Gurney
47763953aa
use file size from torrent instead of file, when file size mismatch
2 years ago
John-Mark Gurney
a0f3da4c4c
update debugging to add timestamps for making query times easier..
2 years ago
John-Mark Gurney
51c68c27c0
add useful debug info when a file cannot be found..
2 years ago
John-Mark Gurney
b5214e47a4
support getting file hashes at same time as verification...
2 years ago
John-Mark Gurney
33ea645f1d
make sure we iterate through possible returns from rglob
2 years ago
John-Mark Gurney
6c3c694f71
apparently shared cache is discouraged, and doesn't make sense anyways
2 years ago
John-Mark Gurney
f4fd04ec10
if container is already present and complete, skip it..
2 years ago
John-Mark Gurney
aeec98115e
bump cache to 15 entries..
2 years ago