simplify things a bit, primarily only metadata objects will ever exist.
Branch: main
Author: John-Mark Gurney, 5 years ago
Commit: dd9bbca84d
2 changed files with 51 additions and 15 deletions
  1. README.md (+24, -11)
  2. sample/file.md (+27, -4)

README.md (+24, -11)

@@ -25,6 +25,10 @@ This work is inspired by my work on STIX, a Cyber Threat Intelligence standard,
9. i18n. Provide translations for fields as needed. Often movie titles will have different translations for different markets/languages. Actors may have different names (e.g. Chinese name vs English name).
10. Overlaying/replacing meta data from someone else's object. This may include deleting properties. Say an actor is missing, or you want to add them to it, or you've encoded the DVD, and you just link to someone's BluRay version.

## URN

Each object has a URN which uniquely describes it. XXX copy from STIX URN proposal, which is similar to the magnet proposal.

## Types

Everything must have a type. Not having well defined types can lead to confusion and problems. Different encoding schemes have different ways of encoding types. If the encoding scheme has a native way to encode that type, it should be used. In some cases, e.g. JSON, there are no formal types beyond numbers and strings, and in this case, a type should (MUST? or via schemas?) be layered on top.
@@ -33,6 +37,18 @@ Everything must have a type. Not having well defined types can lead to confusio

Look at adding units.

### Hash String

The hash string is the name of the hash, followed by a colon, followed by the hex string.

The list of valid hashes is:
- sha256
- sha512
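
As an illustration of the hash string format, here is a minimal Python sketch that builds and parses such strings. The function names `make_hash_string` and `parse_hash_string` are hypothetical and not part of this project; the only facts taken from the text are the `name:hex` layout and the two valid hash names.

```python
import hashlib
from typing import Tuple

# The two hash names listed above.
VALID_HASHES = ("sha256", "sha512")


def make_hash_string(name: str, data: bytes) -> str:
    """Return '<name>:<hex digest>' for the given bytes (hypothetical helper)."""
    if name not in VALID_HASHES:
        raise ValueError(f"unsupported hash: {name!r}")
    return f"{name}:{hashlib.new(name, data).hexdigest()}"


def parse_hash_string(value: str) -> Tuple[str, bytes]:
    """Split a hash string into (name, raw digest), validating both parts."""
    name, sep, hexdigest = value.partition(":")
    if not sep or name not in VALID_HASHES:
        raise ValueError(f"not a valid hash string: {value!r}")
    digest = bytes.fromhex(hexdigest)  # raises ValueError on non-hex input
    if len(digest) != hashlib.new(name).digest_size:
        raise ValueError(f"wrong digest length for {name}")
    return name, digest
```

Checking the digest length against the named algorithm is a design choice of this sketch, not a requirement stated above.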

### Reference

A reference is the UUID optionally followed by two dashes (--) followed by the modified date of the object. The modified date is necessary in some cases to know what version of the object is being referenced.
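
A reference could be parsed along these lines. This is only a sketch: `Reference` and `parse_reference` are hypothetical names, and because the date format is not specified above, the modified date is kept as an opaque string rather than parsed.

```python
import uuid
from typing import NamedTuple, Optional


class Reference(NamedTuple):
    object_uuid: uuid.UUID
    modified: Optional[str]  # None when the reference names no specific version


def parse_reference(ref: str) -> Reference:
    """Split '<uuid>' or '<uuid>--<modified>' into its parts."""
    # A UUID only contains single dashes, so the first '--' is the separator.
    head, sep, modified = ref.partition("--")
    if sep and not modified:
        raise ValueError("'--' must be followed by a modified date")
    # uuid.UUID raises ValueError if head is not a well-formed UUID.
    return Reference(uuid.UUID(head), modified if sep else None)
```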

## Objects

These are the nodes that contain a majority of the data.
@@ -40,19 +56,25 @@ These are the nodes that contain a majority of the data.
### Common Properties

The following properties are present on all (most?) objects:
type The type of the object.
producer_ref UUID of the producer that created this object.
<signing> Add signing info.

### MetaData Object

Properties:
type 'metadata'
uuid UUIDv4
modified date of last modification
modified date of last modification of the metadata object
dc:<prop> A [Dublin Core] property
object_marking_refs Imported from [STIX v2.0 Part 1]: Section 3.1
granular_markings Imported from [STIX v2.0 Part 1]: Section 3.1
hashes A list of hash strings.
lang RFC XXXX language of the properties.
parent_ref UUIDv4 of the parent MetaData Object. Any properties on this object override the parent. (allow deletion via None/null?) Any missing properties are passed through to the parent for resolution.
mime-type The mime-type. If the set of bytes is polymorphic, there should be one for each "type".
uri List of URIs where the file may be located.
child_files A dictionary where the keys are the file names and the values are hash strings. (One issue w/ using hashes is that you can't tie YOUR idea of the metadata, but it also allows a person to have metadata about a file that is private and not be forced to share it, nor create a dummy object.)
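
The `parent_ref` override rule can be sketched as a walk up the parent chain. This is an illustration under assumptions, not the project's implementation: `resolve_properties` is a hypothetical name, objects are plain dicts keyed by UUID in an in-memory `store`, and deletion via None/null (left as an open question above) is not handled, so a `None` value simply overrides the parent's value.

```python
from typing import Any, Dict


def resolve_properties(obj_uuid: str, store: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge an object with its ancestors; properties nearer the object win."""
    chain = []
    seen = set()
    current = obj_uuid
    while current is not None:
        if current in seen:
            raise ValueError("cycle in parent_ref chain")
        seen.add(current)
        obj = store[current]
        chain.append(obj)
        current = obj.get("parent_ref")

    resolved: Dict[str, Any] = {}
    # Apply the most distant ancestor first so each child overrides its parent.
    for obj in reversed(chain):
        resolved.update(obj)
    return resolved
```

For example, a child that sets only `dc:title` would still resolve `lang` from its parent, while its own `dc:title` replaces the parent's.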

Opinion Properties:
qualityrating On a scale from 1 (poor/terrible) to 5 (great/pristine), the subjective quality of the content.
@@ -69,21 +91,12 @@ If a property is imported from the blog itself, it is recommended to mark it as

Open Questions: When meta data is "declassified", how do you maintain a link to the classified version?

### Blob Object

Properties:
uuid UUIDv4
blobhash Hash of the blob. This needs to be indexed
metadata_ref UUID of the MetaData Object

This is the main mapping object. It maps a set of binary data to the MetaData object. All the data must be stored on the MetaData object. The reason it has a UUIDv4 is that this is your private mapping for the blob. You could possibly have multiple mappings, but most people will only have one, and this also allows you to publish your mapping, and coexist w/ other producers' mappings.

### File Object

Properties:
type 'file'
uuid UUIDv5 If the stats do not match, check hash, create a derivative blob object, possibly?
modified date of last modification of the object
blobhash Hash of the binary data.
stat Stats for the file, modified time, file size, used to detect when file has been changed/modified.

A file object references a blob Object, and contains information about the file name in the file system associated w/ the blob. This is used to speed up looking up blob objects.
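
The speed-up described above might look roughly like this: compare the recorded stats first and rehash only when they differ. Everything here is an assumption for illustration, including the helper names (`stat_of`, `hash_file`, `lookup_blobhash`), the `size`/`mtime_ns` keys inside `stat`, and the choice of sha256.

```python
import hashlib
import os
from typing import Any, Dict, Optional, Tuple


def stat_of(path: str) -> Dict[str, int]:
    """Collect the stats used for change detection (assumed: size and mtime)."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime_ns": st.st_mtime_ns}


def hash_file(path: str) -> str:
    """Return the sha256 hash string of the file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()


def lookup_blobhash(path: str, file_obj: Optional[Dict[str, Any]]) -> Tuple[str, bool]:
    """Return (blobhash, reused); reused is True when the stats still match."""
    if file_obj is not None and file_obj["stat"] == stat_of(path):
        return file_obj["blobhash"], True
    # Stats differ or no File Object is known: fall back to hashing the data.
    return hash_file(path), False
```

Matching stats are treated as sufficient evidence that the file is unchanged, which trades a small risk of missing an edit for not rereading every file.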


sample/file.md (+27, -4)

@@ -1,25 +1,48 @@
Sample structure for sharing file information.

# Base file
# Example object hierarchy

secure hash of file
file name? I don't think it should be part of this, as the set of bytes could have any name.
metadata, e.g. code, mime-type, language, alt hashes
file -> metadata

# MetaData Object

secure hash of data
list or dict for hashes?

```
{
'id': 'uuid',
'dc:author': 'example author',
'hashes': [ 'sha256:xxxx' ],
'hash': 'sha256:xxxx',
'length': 1234,
'uri': {
'https://www.example.com/a/path/filename.txt'
}
xxxmetadata
}
```

# Location of file

```
{
'id': 'uuid5',
'uri': [
'https://example.com/path/to/file',
'magnet:?xxx',
'ipfs:xxx'
]
}
```

# Links to file from FS

hostname + path
link to base file

Why not use a file URI w/ host part? There is no UUID host name

How are these versioned? Are they? They need to be, via modified


