The audiotools.bitstream module contains objects for parsing binary data. Unlike Python’s standard-library struct module, these routines are specialized to handle data that’s not strictly byte-aligned.
Given a format string as used by BitstreamReader.parse() or BitstreamWriter.build(), returns the size of that string as an integer number of bits that would be read from or written to the stream.
>>> format_size("3u 4s 36U")
43
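The arithmetic behind this result can be sketched in plain Python 3. The helper below, sketch_format_size, is a hypothetical stand-in and not the library's implementation. It assumes whitespace-separated tokens, assumes that "#P" and "#b" count bytes (eight bits each), and deliberately does not handle the "a" alignment format.

```python
import re

# Hypothetical helper, not part of audiotools.
_TOKEN = re.compile(r"\s*(\d+)\s*([usUSpPb*])")
_BITS_PER_UNIT = {"u": 1, "s": 1, "U": 1, "S": 1, "p": 1, "P": 8, "b": 8}


def sketch_format_size(format_string):
    """Return the number of bits described by a format string."""
    text = format_string.strip()
    total_bits = 0
    repeat = 1  # multiplier set by a preceding "#*" token
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unsupported format at offset %d" % pos)
        count, kind = int(match.group(1)), match.group(2)
        pos = match.end()
        if kind == "*":
            repeat = count
        else:
            total_bits += repeat * count * _BITS_PER_UNIT[kind]
            repeat = 1  # the multiplier applies to one format only
    return total_bits
```

Under these assumptions, sketch_format_size("3u 4s 36U") adds 3 + 4 + 36 bits, which matches the result shown above.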
Given a format string as used by BitstreamReader.parse(), whether the data is little-endian, and a string of binary data, returns a list of values as would be returned by BitstreamReader.parse().
This is roughly equivalent to:
>>> BitstreamReader(StringIO(data), is_little_endian).parse(format_string)
Given a format string as used by BitstreamWriter.build(), whether the data is little-endian, and a sequence of Python values, returns the binary string as would be returned by BitstreamWriter.build().
This is roughly equivalent to:
>>> s = StringIO()
>>> BitstreamWriter(s, is_little_endian).build(format_string, values)
>>> s.getvalue()
This is a file-like object for pulling individual bits or bytes out of a larger binary file stream.
Warning
BitstreamReaders process the given file object in chunks of the given buffer size. This means the position of the file is likely to be further along than one might expect given the number of bits already read. The BitstreamReader’s getpos, setpos and seek methods will handle buffering correctly and are preferable to intermingling BitstreamReader and file operations.
file may be a regular file object, a file-like object with read and close methods, or a plain string.
When operating on a raw file object (such as one opened with open()) this uses a single byte buffer. This allows the underlying file to be seeked safely whenever the BitstreamReader is byte-aligned.
However, when operating on a Python-based file object (with read() and close() methods) this uses an internal string up to buffer_size bytes large in order to minimize Python function calls.
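The effect described in the warning above can be illustrated with a small stand-in written in plain Python 3. ChunkedByteReader is a hypothetical class, not part of audiotools; it only mimics the idea of refilling an internal buffer of buffer_size bytes at a time.

```python
import io


class ChunkedByteReader:
    """Hypothetical stand-in that refills an internal buffer in chunks."""

    def __init__(self, file, buffer_size):
        self._file = file
        self._buffer_size = buffer_size
        self._buffer = b""

    def read_byte(self):
        if not self._buffer:
            # Refill: the underlying file advances by a whole chunk here.
            self._buffer = self._file.read(self._buffer_size)
            if not self._buffer:
                raise IOError("end of stream")
        byte, self._buffer = self._buffer[0], self._buffer[1:]
        return byte


underlying = io.BytesIO(b"\x01\x02\x03\x04\x05\x06\x07\x08")
reader = ChunkedByteReader(underlying, buffer_size=4)
first_byte = reader.read_byte()
position_after_one_byte = underlying.tell()
```

Only one byte has been consumed through the reader, yet the underlying file has already advanced by a full chunk. This is why mixing direct file operations with buffered reads gives surprising positions.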
is_little_endian indicates which endianness format to use when consuming bits. True for little-endian streams, False for big-endian.
Given a number of bits to read from the stream, returns an unsigned integer. May raise IOError if an error occurs reading the stream.
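As a rough illustration of how endianness changes an unaligned read, the following Python 3 sketch extracts bits from a bytes object. The function sketch_read_bits is hypothetical. It assumes the common convention that a big-endian stream consumes each byte from its most-significant bit, while a little-endian stream consumes each byte from its least-significant bit and treats the first bit read as the least significant bit of the result.

```python
def sketch_read_bits(data, bit_offset, count, is_little_endian):
    """Hypothetical helper: read `count` bits starting at `bit_offset`."""
    value = 0
    for i in range(count):
        byte = data[(bit_offset + i) // 8]
        bit_index = (bit_offset + i) % 8
        if is_little_endian:
            # Least-significant bit of each byte comes first.
            bit = (byte >> bit_index) & 1
            value |= bit << i
        else:
            # Most-significant bit of each byte comes first.
            bit = (byte >> (7 - bit_index)) & 1
            value = (value << 1) | bit
    return value


data = bytes([0xB1])  # binary 10110001
big_first = sketch_read_bits(data, 0, 3, False)     # bits 101
big_second = sketch_read_bits(data, 3, 5, False)    # bits 10001
little_first = sketch_read_bits(data, 0, 3, True)   # low three bits
little_second = sketch_read_bits(data, 3, 5, True)  # remaining high bits
```

The same byte therefore yields different 3-bit and 5-bit fields depending on the bit order chosen.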
Given a number of bits to read from the stream as a two’s complement value, returns a signed integer. May raise IOError if an error occurs reading the stream.
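The two’s complement interpretation can be sketched in plain Python 3 as a conversion from the unsigned value of the same bits. The helper sketch_to_signed is hypothetical and only shows the arithmetic, not how the library performs the read.

```python
def sketch_to_signed(unsigned_value, bits):
    """Hypothetical helper: reinterpret an unsigned field as two's complement."""
    if unsigned_value >= 1 << (bits - 1):
        # The top bit is set, so the value is negative.
        return unsigned_value - (1 << bits)
    return unsigned_value
```

For a 4-bit field, the bit pattern 1110 has the unsigned value 14 and the signed value -2.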
Skips the given number of bits in the stream as if read. May raise IOError if an error occurs reading the stream.
Skips the given number of bytes in the stream as if read. May raise IOError if an error occurs reading the stream.
Reads the number of bits until the next stop_bit, which must be 0 or 1. Returns that count as an unsigned integer. May raise IOError if an error occurs reading the stream.
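A unary read simply counts bits until the stop bit appears. The Python 3 sketch below works on an iterator of 0 and 1 values. The name sketch_read_unary is hypothetical, and the iterator stands in for the bitstream.

```python
def sketch_read_unary(bit_iterator, stop_bit):
    """Hypothetical helper: count bits before the next stop bit."""
    if stop_bit not in (0, 1):
        raise ValueError("stop_bit must be 0 or 1")
    count = 0
    for bit in bit_iterator:
        if bit == stop_bit:
            return count
        count += 1
    raise IOError("stream ended before the stop bit was found")


bits = iter([0, 0, 0, 1, 1, 0])
first = sketch_read_unary(bits, 1)   # three 0 bits, then the stop bit
second = sketch_read_unary(bits, 1)  # the stop bit arrives immediately
ones = sketch_read_unary(iter([1, 1, 0]), 0)
```

Because the iterator is shared, the second call continues where the first one stopped, much as successive reads on a stream would.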
Skips a number of bits until the next stop_bit, which must be 0 or 1. May raise IOError if an error occurs reading the stream.
Discards bits as necessary to position the stream on a byte boundary.
Returns True if the stream is positioned on a byte boundary.
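In terms of a running bit position, alignment comes down to simple modular arithmetic. The two Python 3 helpers below are hypothetical and assume the position is counted in bits from the start of the stream.

```python
def bits_until_byte_boundary(bit_position):
    """Hypothetical helper: bits to discard to reach the next byte boundary."""
    return (-bit_position) % 8


def is_byte_aligned(bit_position):
    """Hypothetical helper: True when the position is a multiple of 8."""
    return bit_position % 8 == 0
```

After reading 3 bits, for example, 5 more bits would be discarded to reach the boundary, and no bits are discarded when the position is already aligned.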
Given a format string representing a set of individual reads, returns a list of those reads.
format | method performed |
---|---|
“#u” | read(#) |
“#s” | read_signed(#) |
“#p” | skip(#) |
“#P” | skip_bytes(#) |
“#b” | read_bytes(#) |
“a” | byte_align() |
For instance:
>>> r.parse("3u 4s 36U") == [r.read(3), r.read_signed(4), r.read(36)]
The * format multiplies the next format by the given amount. For example, to read four signed 8-bit values:
>>> r.parse("4* 8s") == [r.read_signed(8) for i in range(4)]
May raise IOError if an error occurs reading the stream.
Given a HuffmanTree object, returns the next Huffman code from the stream as defined in the tree. May raise IOError if an error occurs reading the stream.
Pushes a single bit back onto the stream, which must be 0 or 1. Only a single bit of pushback is guaranteed to succeed.
Returns the given number of 8-bit bytes from the stream as a binary string. May raise IOError if an error occurs reading the stream.
Sets the stream’s endianness where False indicates big-endian, while True indicates little-endian. The stream is automatically byte-aligned prior to changing its byte order.
Returns a BitstreamReaderPosition object of the stream’s current position. May raise IOError if an error occurs getting the position.
Given a BitstreamReaderPosition object, sets the stream to that position. The position must be one returned by this object’s BitstreamReader.getpos() method; one cannot apply a position from one reader to a different one. May raise IOError if an error occurs setting the position.
Given an integer position value, positions the stream at the given byte relative to whence, which may be 0 for the beginning of the stream (the default), 1 for the current position and 2 for the stream end.
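The whence values listed above follow the same convention as Python’s own file objects. The Python 3 sketch below uses io.BytesIO purely as a stand-in to show the three modes; it does not exercise BitstreamReader itself.

```python
import io

stream = io.BytesIO(b"ABCDEFGH")

stream.seek(2)                 # whence defaults to 0: from the beginning
from_start = stream.read(1)    # reads the byte at offset 2

stream.seek(2, 1)              # whence 1: two bytes past the current position
from_current = stream.read(1)  # reads the byte at offset 5

stream.seek(-1, 2)             # whence 2: one byte before the end
from_end = stream.read(1)      # reads the last byte
```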
Adds a callable function to the stream’s callback stack. callback(b) takes a single byte as an argument. This callback is called upon each byte read from the stream. If multiple callbacks are added, they are all called in reverse order.
Calls all the callbacks on the stream’s callback stack with the given byte, as if it had been read from the stream.
Removes and returns the most recently added function from the callback stack.
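The callback methods behave like a stack. The Python 3 sketch below models that with a hypothetical CallbackStack class. It assumes that "reverse order" means the most recently added callback is called first, which is an interpretation of the text above and not confirmed behaviour.

```python
class CallbackStack:
    """Hypothetical model of a stream's callback stack."""

    def __init__(self):
        self._callbacks = []

    def add_callback(self, callback):
        self._callbacks.append(callback)

    def pop_callback(self):
        return self._callbacks.pop()

    def call_callbacks(self, byte):
        # Assumption: the most recently added callback runs first.
        for callback in reversed(self._callbacks):
            callback(byte)


order = []
stack = CallbackStack()
stack.add_callback(lambda byte: order.append(("first", byte)))
stack.add_callback(lambda byte: order.append(("second", byte)))
stack.call_callbacks(0x7F)   # both callbacks see the byte
stack.pop_callback()         # removes the callback added last
stack.call_callbacks(0x01)   # only the remaining callback runs
```

A typical use of such callbacks is to accumulate a checksum over every byte that passes through the stream.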
Returns a new BitstreamReader object which contains bytes amount of data read from the current stream and defined with the current stream’s endianness. May raise an IOError if the current stream has insufficient bytes. Any callbacks defined in the current stream are applied to the bytes read for the substream when this method is called. Any marks or callbacks in the current stream are not transferred to the substream. In all other respects, the substream acts like any other BitstreamReader. However, attempting to have the substream read beyond its defined byte count will trigger IOError exceptions.
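The idea of carving a fixed number of bytes out of a parent stream can be sketched in Python 3 with io.BytesIO. The function sketch_substream is hypothetical. Unlike the substream described above, the stand-in returns short reads at its end instead of raising IOError.

```python
import io


def sketch_substream(stream, byte_count):
    """Hypothetical helper: copy `byte_count` bytes into a new stream."""
    data = stream.read(byte_count)
    if len(data) < byte_count:
        raise IOError("insufficient bytes for substream")
    return io.BytesIO(data)


parent = io.BytesIO(b"headerpayload")
header = sketch_substream(parent, 6)
header_bytes = header.read()  # limited to the six bytes copied
remaining = parent.read()     # the parent continues after the substream
```

This pattern is handy when a container format gives the size of a block up front and the block's parser must not read past it.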
Closes the stream and any underlying file object, by calling its close method.
Returns the reader’s context manager.
Exits the reader’s context manager by calling file.close() on the wrapped file object. If one wishes to keep the stream open for further reading, don’t use a context manager and simply delete the reader object. But again, be aware that buffering may make its current position different than one might expect.
This is a file-like object for pushing individual bits or bytes into a larger binary file stream.
Warning
BitstreamWriters process the given file object in chunks of the given buffer size. This means the position of the file is likely to be not as far along as one might expect given the number of bits already written. The BitstreamWriter’s getpos and setpos methods will handle buffering correctly and are preferable to intermingling BitstreamWriter and file operations.
When operating on a raw file object (such as one opened with open()) this uses a single byte buffer. This allows the underlying file to be seeked safely whenever the BitstreamWriter is byte-aligned. However, when operating on a Python-based file object (with write() and close() methods) this uses an internal string up to buffer_size bytes large in order to minimize Python function calls.
Writes the given unsigned integer value to the stream using the given number of bits. May raise IOError if an error occurs writing the stream.
Writes the given signed integer value to the stream using the given number of bits. May raise IOError if an error occurs writing the stream.
If stop_bit is 1, writes value number of 0 bits to the stream followed by a 1 bit. If stop_bit is 0, writes value number of 1 bits to the stream followed by a 0 bit. May raise IOError if an error occurs writing the stream.
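The bit pattern produced by a unary write can be sketched in plain Python 3 as a list of 0 and 1 values. The helper sketch_unary_bits is hypothetical and only shows the pattern, not how bits reach the stream.

```python
def sketch_unary_bits(value, stop_bit):
    """Hypothetical helper: the bits a unary write of `value` would emit."""
    if stop_bit not in (0, 1):
        raise ValueError("stop_bit must be 0 or 1")
    if value < 0:
        raise ValueError("value must be non-negative")
    # `value` copies of the opposite bit, then the stop bit itself.
    return [1 - stop_bit] * value + [stop_bit]
```

For example, the value 3 with a stop bit of 1 becomes 0 0 0 1, and the value 2 with a stop bit of 0 becomes 1 1 0.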
Given a HuffmanTree object and an integer value to write, determines the proper output code and writes it to the stream. Raises ValueError if the integer value is not present in the tree.
Writes 0 bits as necessary until the stream is aligned on a byte boundary. May raise IOError if an error occurs writing the stream.
Returns True if the stream is positioned on a byte boundary.
Given a format string representing a set of individual writes, and a list of values to write, performs those writes to the stream.
format | value | method performed |
---|---|---|
“#u” | unsigned int | write(#, u) |
“#s” | signed int | write_signed(#, s) |
“#p” | N/A | write(#, 0) |
“#P” | N/A | write(# * 8, 0) |
“#b” | string | write_bytes(#, s) |
“a” | N/A | byte_align() |
For instance:
>>> w.build("3u 4s 36U", [1, -2, 3L])
is equivalent to:
>>> w.write(3,1)
>>> w.write_signed(4, -2)
>>> w.write(36, 3L)
The * format multiplies the next format by the given amount.
>>> w.build("4* 8s", [-2, -1, 0, 1])
is equivalent to:
>>> w.write_signed(8, -2)
>>> w.write_signed(8, -1)
>>> w.write_signed(8, 0)
>>> w.write_signed(8, 1)
May raise IOError if an error occurs writing the stream.
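To make the bit layout of a call such as build("3u 4s 36U", [1, -2, 3]) concrete, the Python 3 sketch below packs a list of (bits, value) fields into bytes. The function sketch_pack_fields is hypothetical. It assumes big-endian bit order and pads the final byte with 0 bits, and it is not how the library writes data.

```python
def sketch_pack_fields(fields):
    """Hypothetical helper: pack (bits, value) pairs, most-significant bit first."""
    accumulator = 0
    total_bits = 0
    for bits, value in fields:
        # Masking turns negative values into their two's complement pattern.
        accumulator = (accumulator << bits) | (value & ((1 << bits) - 1))
        total_bits += bits
    padding = (-total_bits) % 8  # 0 bits needed to reach a byte boundary
    accumulator <<= padding
    return accumulator.to_bytes((total_bits + padding) // 8, "big")


packed = sketch_pack_fields([(3, 1), (4, -2), (36, 3)])
```

The three fields occupy 43 bits, so under these assumptions the result is six bytes long with five padding bits at the end.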
Writes the given binary string to the stream with a number of bytes equal to its length. May raise IOError if an error occurs writing the stream.
Flushes cached bytes to the stream. Partially written bytes are not flushed to the stream. May raise IOError if an error occurs writing the stream.
Sets the stream’s endianness where False indicates big-endian, while True indicates little-endian. The stream is automatically byte-aligned prior to changing its byte order.
Adds a callable function to the stream’s callback stack. callback(b) takes a single byte as an argument. This callback is called upon each byte written to the stream. If multiple callbacks are added, they are all called in reverse order.
Calls all the callbacks on the stream’s callback stack with the given byte, as if it had been written to the stream.
Removes and returns the most recently added function from the callback stack.
Returns a BitstreamWriterPosition object of the stream’s current position. May raise IOError if the stream is not byte-aligned or an error occurs getting the position.
Given a BitstreamWriterPosition object, sets the stream to that position. The position must be one returned by this object’s BitstreamWriter.getpos() method; one cannot apply a position from one writer to a different one. May raise IOError if the stream is not byte-aligned or an error occurs setting the position.
Flushes cached bytes to the stream and closes the underlying file object with its close method.
Returns the writer’s context manager.
Exits the writer’s context manager by calling file.close() on the wrapped file object. If one wishes to keep the stream open for further writing, don’t use a context manager and simply delete the writer object. But again, be aware that buffering may make its current position different than one might expect.
This is a file-like object for recording the writing of individual bits or bytes, for possible output into a BitstreamWriter.
is_little_endian indicates whether to record a big-endian or little-endian output stream.
Records the given unsigned integer value to the stream using the given number of bits. Bits must be: 0 <= bits <= 32. Value must be: 0 <= value < (2 ** bits).
Records the given unsigned integer value to the stream using the given number of bits. Bits must be: 0 <= bits <= 64. Value must be: 0 <= value < (2 ** bits).
Records the given signed integer value to the stream using the given number of bits. Bits must be: 0 <= bits <= 32. Value must be: -(2 ** (bits - 1)) <= value < 2 ** (bits - 1).
Records the given signed integer value to the stream using the given number of bits. Bits must be: 0 <= bits <= 64. Value must be: -(2 ** (bits - 1)) <= value < 2 ** (bits - 1).
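The value ranges stated above can be checked before recording with two small Python 3 predicates. Both functions are hypothetical helpers that restate the inequalities. They assume bits is at least 1 and do not check the upper limit on the bit count.

```python
def fits_unsigned(value, bits):
    """Hypothetical helper: True if value fits an unsigned field of `bits` bits."""
    return 0 <= value < (2 ** bits)


def fits_signed(value, bits):
    """Hypothetical helper: True if value fits a two's complement field."""
    return -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
```

A 4-bit signed field, for instance, accepts values from -8 through 7, while a 3-bit unsigned field accepts 0 through 7.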
If stop_bit is 1, records value number of 0 bits to the stream followed by a 1 bit. If stop_bit is 0, records value number of 1 bits to the stream followed by a 0 bit.
Given a HuffmanTree object and an integer value to write, determines the proper output code and records it for writing. Raises ValueError if the integer value is not present in the tree.
Records 0 bits as necessary until the stream is aligned on a byte boundary.
Returns True if the stream is positioned on a byte boundary.
Given a format string representing a set of individual writes, and a list of values to write, records those writes to the stream.
format | value | method performed |
---|---|---|
“#u” | unsigned int | write(#, u) |
“#s” | signed int | write_signed(#, s) |
“#U” | unsigned long | write64(#, ul) |
“#S” | signed long | write_signed64(#, sl) |
“#p” | N/A | write(#, 0) |
“#P” | N/A | write(# * 8, 0) |
“#b” | string | write_bytes(#, s) |
“a” | N/A | byte_align() |
For instance:
>>> w.build("3u 4s 36U", [1, -2, 3L])
is equivalent to:
>>> w.write(3,1)
>>> w.write_signed(4, -2)
>>> w.write64(36, 3L)
Records the given binary string to the stream with a number of bytes equal to its length.
Sets the stream’s endianness where False indicates big-endian, while True indicates little-endian. The stream is automatically byte-aligned prior to changing its byte order.
Adds a callable function to the stream’s callback stack. callback(b) takes a single byte as an argument. This callback is called upon each byte recorded to the stream. If multiple callbacks are added, they are all called in reverse order.
Calls all the callbacks on the stream’s callback stack with the given byte, as if it had been recorded to the stream.
Removes and returns the most recently added function from the callback stack.
Returns a BitstreamWriterPosition object of the stream’s current position. May raise IOError if the stream is not byte-aligned or an error occurs getting the position.
Given a BitstreamWriterPosition object, sets the stream to that position. The position must be one returned by this object’s BitstreamRecorder.getpos() method; one cannot apply a position from one writer to a different one. May raise IOError if the stream is not byte-aligned or an error occurs setting the position.
Does nothing. This is merely a placeholder for compatibility with BitstreamWriter.
Does nothing. This is merely a placeholder for compatibility with BitstreamWriter.
Returns the count of bits recorded as an integer.
Returns the count of bytes recorded as an integer.
Given a BitstreamWriter or BitstreamRecorder object, copies all recorded output to that stream, including any partially written bytes.
Returns a binary string of recorded data, not including any partially written bytes.
Erases all recorded data and resets the stream for fresh recording.
Swaps the recorded data with the given BitstreamRecorder object. This is often useful for finding the best output given many possible input permutations:
>>> best_case = BitstreamRecorder(False)
>>> write_data(best_case, default_arguments)
>>> next_best = BitstreamRecorder(False)
>>> for arguments in argument_list:
...     next_best.reset()
...     write_data(next_best, arguments)
...     if (next_best.bits() < best_case.bits()):
...         next_best.swap(best_case)
>>> best_case.copy(output_writer)
Unlike replacing the best_case object with next_best, swapping and resetting allows BitstreamRecorder to reuse allocated data buffers.
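The swap-and-reset pattern can be tried without the library by using a small stand-in written in Python 3. SketchRecorder and this version of write_data are hypothetical; the stand-in stores bits in a list and supports only the methods the pattern needs.

```python
class SketchRecorder:
    """Hypothetical stand-in that records bits in a reusable list."""

    def __init__(self):
        self._bits = []

    def write(self, bits, value):
        for shift in range(bits - 1, -1, -1):
            self._bits.append((value >> shift) & 1)

    def bits(self):
        return len(self._bits)

    def reset(self):
        del self._bits[:]  # empties the list but keeps the same object

    def swap(self, other):
        self._bits, other._bits = other._bits, self._bits


def write_data(recorder, field_width):
    # Hypothetical encoder: writes three values using a given field width.
    for value in (1, 2, 3):
        recorder.write(field_width, value)


best_case = SketchRecorder()
write_data(best_case, 8)
next_best = SketchRecorder()
for field_width in (6, 2, 4):
    next_best.reset()
    write_data(next_best, field_width)
    if next_best.bits() < best_case.bits():
        next_best.swap(best_case)
```

After the loop, best_case holds the shortest of the candidate encodings, and only two recorder objects were allocated in total.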
Returns the recorder’s context manager.
Exits the recorder’s context manager.
This is a compiled Huffman tree for use by BitstreamReader and BitstreamWriter.
bits_list is a list of 0 or 1 values which, when read from the stream on a bit-by-bit basis, result in the final integer value.
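For example, given a Huffman tree in which the bits 1 decode to the value 1, the bits 0 1 decode to 2, the bits 0 0 1 decode to 3 and the bits 0 0 0 decode to 4, we define our Huffman tree for a big-endian stream as follows: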
>>> HuffmanTree([(1, ), 1,
... (0, 1), 2,
... (0, 0, 1), 3,
... (0, 0, 0), 4], False)
Note that the bits in the tree are always consumed from the least-significant position to most-significant. This may differ from how they are consumed from the stream based on its is_little_endian value.
The resulting object is passed to BitstreamReader.read_huffman_code() to read the next value from a stream, and to BitstreamWriter.write_huffman_code() to write a given value to the stream.
May raise ValueError if the tree is incorrectly specified.
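How such a definition maps bit sequences to values can be sketched in plain Python 3 with a dictionary lookup. The three helpers below are hypothetical and are not the compiled tree the library builds. They ignore endianness and read bits from an iterator in the order given.

```python
def sketch_code_table(definition):
    """Hypothetical helper: map bit tuples to values from an alternating list."""
    pairs = zip(definition[0::2], definition[1::2])
    return {tuple(bits): value for bits, value in pairs}


def sketch_read_huffman_code(table, bit_iterator):
    """Hypothetical helper: consume bits until they match a code."""
    prefix = ()
    for bit in bit_iterator:
        prefix += (bit,)
        if prefix in table:
            return table[prefix]
    raise IOError("stream ended inside a Huffman code")


def sketch_huffman_bits(table, value):
    """Hypothetical helper: find the bits that encode `value`."""
    for bits, candidate in table.items():
        if candidate == value:
            return bits
    raise ValueError("value not present in tree")


table = sketch_code_table([(1,), 1, (0, 1), 2, (0, 0, 1), 3, (0, 0, 0), 4])
stream_bits = iter([0, 0, 1, 1, 0, 0, 0, 0, 1])
decoded = [sketch_read_huffman_code(table, stream_bits) for _ in range(4)]
```

Because no code in the table is a prefix of another, the decoder can stop at the first match without looking ahead.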