In most cases, keys are US-ASCII, but BEP-52 allows some keys to be binary
data. It appears this only applies to dictionaries under 'piece layer', so
pass down the parent key and don't do the encoding for the dictionary under it.
The encoder also had a bug where, if a dictionary key had code points >127,
it would encode the values incorrectly, because it took the length of the
string and not the length of the encoded bytes.
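A minimal sketch of the length-prefix fix; the function name and encoder shape here are assumptions for illustration, not the project's actual codec:

```python
def bencode_str(s):
    # Bencode strings are '<length>:<bytes>'; the length must count the
    # encoded bytes.  The old code used len(s) on the unencoded str, which
    # is too short for any key or value with code points > 127.
    data = s.encode('utf-8') if isinstance(s, str) else s
    return b'%d:%s' % (len(data), data)

assert bencode_str('é') == b'2:\xc3\xa9'          # one character, two bytes
assert bencode_str(b'\x00\xff') == b'2:\x00\xff'  # binary keys pass through as-is
```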
SQLite3 does not plan joins sanely, so we emulate them w/ subqueries for a
large boost. Not sure whether adding DISTINCT would improve things; the query
plan does not change between the two (though the lower-level ops may), and in
a quick test it didn't seem to make a difference (not evaluated statistically).
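Roughly the shape of the rewrite; the schema, table, and column names below are hypothetical, not the project's real ones:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE metadata (file_id INTEGER, key TEXT, value TEXT);
    CREATE INDEX metadata_kv ON metadata (key, value, file_id);
''')

# Join form that SQLite planned poorly (assumed shape of the old query):
join_q = '''
    SELECT * FROM files f
    JOIN metadata m ON m.file_id = f.id
    WHERE m.key = 'type' AND m.value = 'image'
'''

# Subquery form that emulates the join and picked up the large boost:
subq_q = '''
    SELECT * FROM files
    WHERE id IN (SELECT file_id FROM metadata
                 WHERE key = 'type' AND value = 'image')
'''

for q in (join_q, subq_q):
    print(conn.execute('EXPLAIN QUERY PLAN ' + q).fetchall())
```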
Add a class that emulates a file but only stores the parts of the file that
were read/accessed. This reduces storing an 11MB file down to under 100KB,
and it also allows tests to run w/o the whole file.
Put the original files in fixtures/original.
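A minimal sketch of the idea, assuming the captured regions are stored as an offset-to-bytes mapping; the class name and storage format are assumptions, not the actual fixture code:

```python
class SparseFile:
    """File-like object holding only the byte ranges captured from the original."""

    def __init__(self, size, chunks):
        # chunks: {offset: bytes} for the regions a test actually reads.
        self.size = size
        self.chunks = dict(chunks)
        self.pos = 0

    def tell(self):
        return self.pos

    def seek(self, offset, whence=0):
        if whence == 0:
            self.pos = offset
        elif whence == 1:
            self.pos += offset
        else:
            self.pos = self.size + offset
        return self.pos

    def read(self, n=-1):
        if n < 0:
            n = self.size - self.pos
        for off, data in self.chunks.items():
            if off <= self.pos and self.pos + n <= off + len(data):
                out = data[self.pos - off:self.pos - off + n]
                self.pos += n
                return out
        raise ValueError('range %d:%d was not captured' % (self.pos, self.pos + n))
```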
Fix up a couple of issues w/ parsing CRW files, and also allow skipping parts
of the CRW file. This makes it possible to skip large sections, like the CCD
data and the large thumbnail.
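The skip logic is roughly the following sketch; the tag values and the skip set are assumptions, not the real CIFF constants used here:

```python
# Records whose payloads we never want to load (values are assumed, e.g.
# raw CCD data and the embedded JPEG thumbnail).
SKIP_TAGS = {0x2005, 0x2007}

def read_records(f, entries):
    """Yield (tag, payload) for directory entries, skipping the large ones."""
    for tag, length, offset in entries:
        if tag in SKIP_TAGS:
            continue            # don't read the payload at all, just skip it
        f.seek(offset)
        yield tag, f.read(length)
```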
This still needs some cleanup and additional tests. It isn't hooked into the
testing system yet, as I still haven't decided whether I'm going to commit
fixtures (or maybe make this its own repo).
IFD needs serious cleanup; I should be using a classmethod instead of the
janky nextptr approach.
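Something like the following is what the classmethod cleanup would look like; this is a sketch assuming a TIFF-style directory layout, and the names are not the actual IFD class:

```python
class IFD:
    def __init__(self, entries, next_offset):
        self.entries = entries
        self.next_offset = next_offset

    @classmethod
    def from_file(cls, f, offset):
        # Parse the directory at `offset` and return it along with where the
        # next directory begins, instead of threading a nextptr through the
        # caller.
        f.seek(offset)
        count = int.from_bytes(f.read(2), 'little')
        entries = [f.read(12) for _ in range(count)]
        next_offset = int.from_bytes(f.read(4), 'little')
        return cls(entries, next_offset)
```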
This imports magic.py from file-magic and merges magic_wrap.py into it.
It also updates detect_from_filename to try w/ _COMPRESS first and, if that
returns an error, to retry in normal mode. This is necessary because [some?]
zip files can be decompressed by gzip but throw an error.
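The fallback has roughly this shape; this is a sketch assuming the libmagic binding's open/load/file/error interface, and the flag combination and error handling are assumptions rather than the merged code:

```python
import magic  # magic.py from file-magic, w/ magic_wrap.py merged in

def detect_from_filename(filename):
    # Try w/ decompression first; some zip files gunzip "successfully" but
    # libmagic reports an error, so fall back to a plain lookup.
    m = magic.open(magic.MAGIC_COMPRESS | magic.MAGIC_MIME)
    m.load()
    result = m.file(filename)
    if result is None or m.error() is not None:
        m.close()
        m = magic.open(magic.MAGIC_MIME)
        m.load()
        result = m.file(filename)
    m.close()
    return result
```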
The original query applied a complicated test, and SQLite couldn't tell
whether it applied to all rows.
In the case of any inclusion it's easy: only search metadata, and match the
hits back to files.
If it's all exclusions, split it into two parts: the files w/ a metadata
object that doesn't have the exclusions, and the files w/o any metadata
objects.
Both of the latter queries can be satisfied more simply and w/ proper indices.
The old query might have worked fine on a more advanced DB, but this split was
necessary here for decent performance.
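Roughly the shape of the split, using the same hypothetical files/metadata schema as above (names are assumptions, not the real schema):

```python
# Any-inclusion case: search metadata only, then match back to files.
include_q = '''
    SELECT * FROM files
    WHERE id IN (SELECT file_id FROM metadata WHERE tag IN ('keep', 'wanted'))
'''

# All-exclusion case, split into two simply-indexable parts:
# files whose metadata has none of the excluded tags, plus files with
# no metadata rows at all.
exclude_q = '''
    SELECT * FROM files
    WHERE id IN (SELECT file_id FROM metadata)
      AND id NOT IN (SELECT file_id FROM metadata
                     WHERE tag IN ('skip', 'trash'))
    UNION
    SELECT * FROM files
    WHERE id NOT IN (SELECT file_id FROM metadata)
'''
```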