- Jun 07, 2015
-
-
Scott Vokes authored
-
Scott Vokes authored
Fix typo in CONTRIBUTING.md
-
Aria Stewart authored
-
Scott Vokes authored
API Changes:

1. The limits for the window and lookahead size options have changed. The minimum lookahead bits option (-l) is now 3 (previously 2), and the maximum is the window bits - 1 (previously, equal to the window bits).

Bug Fixes:

1. There was an edge case in the internal buffering where window bits of 4 and lookahead bits of 2 could cause up to 3 bytes of data to be lost from the end of the bytestream. Thanks to @unixdj for reporting this, with detailed notes on how to reproduce it.

2. Using a lookahead bits setting equal to the window bits (e.g. -w 8 -l 8) could lead to an infinite loop. Aside from fixing the root cause, the lookahead size must now be smaller than the window size: an equal lookahead size will always lead to worse compression, and should probably never have been a valid setting. Found by @unixdj.

3. A few compiler warnings due to signed/unsigned comparisons have been resolved.

Other Improvements:

1. The logic for determining whether a pattern substitution is worthwhile has been improved, potentially leading to better compression. Decoders from earlier releases should have no problems processing the output from newer encoders.

2. A benchmarking suite has been added, to measure the impact of changes on speed and compression ratios.

3. The encoder and decoder state machines have been streamlined, reducing overall complexity. Thanks again to @unixdj, who saw opportunities to simplify things while investigating other issues, and did most of the work.

4. Notes on example usage and configuration have been added to the README.

5. Initial notes for potential contributors (CONTRIBUTING.md) have been added.

6. The theft-based property tests are now exploring the full state space of the program. Previously, some of the test cases weren't varying the window or lookahead sizes, only the input data. This led to a couple of bugs (bug fixes 1 and 2 above) going undetected. Now all three are varied, as well as the input buffer size for the decoder.
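The lookahead limits stated in these notes can be illustrated with a small check. This is only a sketch, not code from the project: the function name `config_is_valid` and the constant `MIN_LOOKAHEAD_BITS` are hypothetical, and the bounds on the window size itself are not checked because the notes do not state them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the minimum lookahead bits named in the notes. */
#define MIN_LOOKAHEAD_BITS 3

/* Returns true if the lookahead setting satisfies the two limits
 * described in the release notes:
 *   - lookahead bits must be at least 3
 *   - lookahead bits must be at most (window bits - 1)
 * Limits on window_bits itself are deliberately not checked here. */
bool config_is_valid(uint8_t window_bits, uint8_t lookahead_bits)
{
    if (lookahead_bits < MIN_LOOKAHEAD_BITS) {
        return false;   /* e.g. -l 2 is no longer accepted */
    }
    if (lookahead_bits >= window_bits) {
        return false;   /* e.g. -w 8 -l 8 is no longer accepted */
    }
    return true;
}
```

Under these assumptions, `config_is_valid(8, 4)` is accepted, while the two settings involved in the bugs above, `(4, 2)` and `(8, 8)`, are rejected.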
-
Scott Vokes authored
Three cheers for actively maintained projects.
-
Scott Vokes authored
-
Scott Vokes authored
-
Scott Vokes authored
Conflicts: heatshrink_encoder.c

Merge several PRs and other changes on `develop` in preparation for the 0.4.0 release.

Closes #16. (Addressed directly by 15ebaddc, and indirectly by several other changes.)
-
Scott Vokes authored
-
Scott Vokes authored
Closes #26.
-
Scott Vokes authored
-
Vadim Vygonets authored
Fix the bug where the last 7 bits are not processed.

Test case:

    $ echo -n aaaa | ./heatshrink -w4 -l2 | ./heatshrink -dw4 -l2 | hexdump -C

expected output:

    00000000  61 61 61 61  |aaaa|
    00000004

bug:

    00000000  61  |a|
    00000001

Conflicts: heatshrink_decoder.c
-
Scott Vokes authored
-
Scott Vokes authored
Download and cache the Canterbury Corpus, compress files in it with various settings, and report on the compression ratios. Closes #24.
-
- May 31, 2015
-
-
Scott Vokes authored
-
Scott Vokes authored
This should stress the suspending/resuming as buffers fill more often.
-
Vadim Vygonets authored
bit_accumulator was not preserved between calls.
-
Vadim Vygonets authored
It's only called with count <= 8.
-
Scott Vokes authored
Delete some code that is no longer being used, but did not get deleted during merging because of cherry-picking / out of order integration of several concurrent pull requests. Closes #22.
-
Vadim Vygonets authored
-
Vadim Vygonets authored
Conflicts: heatshrink_encoder.c
-
Vadim Vygonets authored
-
Vadim Vygonets authored
Apparently, gcc only warns about [u]int16_t comparisons on platforms where sizeof(int) == sizeof(int16_t). Seems like integer type promotion happens here, and gcc only warns when it changes the comparison result.

The maximum buffer length is 1<<16 (when window size is 15), in which case end takes values >=0x8000, and start between 0 and 0x8000, inclusive. Which means we really want to compare unsigned pos to signed start (except when start is 0x8000), so comparing them as either int16_t or uint16_t would be incorrect. But the difference between end-1 and start is always strictly less than 1<<window_size, which makes it positive when interpreted as int16_t.

The only other case in which the difference between pos and start may reach a positive value on 16-bit platforms is when indexing is used, the window size is 15 and the backlog is empty, making start 0x8000, and pos gets the sentinel end-of-list value 0xFFFF from the index. This case is handled by changing the sentinel value to 0 when the backlog is not full. (Updated, see below.)

Conflicts: heatshrink_encoder.c

EDIT: Since commit 7e818d55 eliminates the backlog full calculations by initially filling the backlog with zeroes (i.e., the backlog is always full), the special case sentinel value of 0 is no longer necessary.

Closes #19.
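The promotion behaviour mentioned in this message can be seen in isolation. The sketch below is not taken from heatshrink, and the function name `pos_greater_than_start` is hypothetical. It assumes a platform where `int` is wider than 16 bits, so both operands are promoted to `int` and keep their numeric values.

```c
#include <stdint.h>

/* Hypothetical helper: compares an unsigned 16-bit position with a
 * signed 16-bit start value.
 *
 * Assumption: int is wider than 16 bits. Both operands are then
 * promoted to int, so 0x8000 stays 32768 and -1 stays -1, and the
 * comparison gives the mathematically expected answer.
 *
 * Where sizeof(int) == sizeof(int16_t), uint16_t has the same width
 * as unsigned int, so start would instead be converted to unsigned
 * and the result could differ. */
int pos_greater_than_start(uint16_t pos, int16_t start)
{
    return pos > start;
}
```

With a 32-bit `int`, `pos_greater_than_start(0x8000, -1)` returns 1, because the comparison is carried out as 32768 > -1.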
-
Vadim Vygonets authored
-
Scott Vokes authored
Closes #18.
-
Vadim Vygonets authored
-
Scott Vokes authored
Using a lookahead size equal to the window size could previously lead to an infinite loop (discovered and fixed by @unixdj, thanks). There really isn't any benefit to having a lookahead size that large, as it only applies when the entire input stream is the same. In all other cases, it makes the compression ratio worse. Closes #20.
-
Vadim Vygonets authored
BUG: When window size equals lookahead, the encoder enters an endless loop after WINDOW_SIZE bytes are sunk into the buffer, unless at EOF.

Test:

    $ echo -n 0123456789abcdefX | ./heatshrink -w 4 -l 4 >/dev/null
-
Scott Vokes authored
Closes #21
-
Scott Vokes authored
-
Vadim Vygonets authored
Closes #21
-
Scott Vokes authored
-
- May 24, 2015
-
-
Scott Vokes authored
Fix signed/unsigned comparison
-
Scott Vokes authored
As reported by @unixdj, there is a case where a few bytes can be dropped from the end of the bytestream when used with a window_sz2 of 4 and lookahead_sz2 of 2 (-w 4 -l 2):

    $ echo -n aaaa | ./heatshrink -e -w4 -l2 | ./heatshrink -d -w4 -l2
    a # should be "aaaa"

While st_check_for_input can treat 7 bits as sufficient input when -w is 4 and -l is 2, that creates a corresponding issue where 1 spillover bit from the previous byte leads to filler of 0b000 0000, which is interpreted as a marker to repeat (0b0) from 1 byte back (0b0000) for 1 byte (0b00), leading to a duplication of the last byte of input.

Using a w,l pair where w+l < 7 leads to trailing bits that are ambiguous, so raise the minimum lookahead bits to 3. This problem does not occur with -w 4 -l 3, or any other valid config.
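The arithmetic behind the new minimum can be sketched as follows. Going by the description above, a backreference takes one marker bit plus the window and lookahead bits, and the final byte can carry up to 7 filler bits. The function names `backref_bits` and `filler_is_ambiguous` are hypothetical and do not come from the heatshrink sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits in one backreference as described in the message:
 * 1 marker bit + window (index) bits + lookahead (count) bits. */
unsigned backref_bits(uint8_t window_bits, uint8_t lookahead_bits)
{
    return 1u + window_bits + lookahead_bits;
}

/* True if up to 7 filler bits in the final byte are enough to be
 * read as a complete backreference, i.e. when w + l < 7. */
bool filler_is_ambiguous(uint8_t window_bits, uint8_t lookahead_bits)
{
    return backref_bits(window_bits, lookahead_bits) <= 7u;
}
```

For -w 4 -l 2 a backreference is 7 bits, so the zero filler can be read as one. For -w 4 -l 3 it is 8 bits, which no longer fits in the filler.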
-
Scott Vokes authored
This is to ensure the full state space is explored. Allocate large in-memory buffers with malloc (re-using them after zeroing) for the tests, since these cannot fill up the way smaller stack-allocated ones can.
-
Scott Vokes authored
-
- May 14, 2015
-
-
Scott Vokes authored
Thanks to @unixdj for finding this bug and providing very detailed info about how to reproduce it!
-
- May 13, 2015
-
-
Vadim Vygonets authored
-
- May 11, 2015
-
-
Vadim Vygonets authored
-
Scott Vokes authored
Fix debug output and asserts
-