- Nov 01, 2015
-
-
Scott Vokes authored
The bound can include (1 << window size), since the offset is shifted by 1. There is no point in encoding a 0-byte window, and increasing the size for possible backreferences improves the correction ratio. Reported by @timothyzillon, @sw17ch, @jibsen.
-
- May 31, 2015
-
-
Scott Vokes authored
Delete some code that is no longer being used, but did not get deleted during merging because of cherry-picking / out of order integration of several concurrent pull requests. Closes #22.
-
Vadim Vygonets authored
-
Vadim Vygonets authored
Conflicts: heatshrink_encoder.c
-
Vadim Vygonets authored
Apparently, gcc only warns about [u]int16_t comparisons on platforms where sizeof(int) == sizeof(int16_t). Seems like integer type promotion happens here, and gcc only warns when it changes the comparison result. The maximum buffer length is 1<<16 (when window size is 15), in which case end takes values >=0x8000, and start between 0 and 0x8000, inclusive. Which means we really want to compare unsigned pos to signed start (except when start is 0x8000), so comparing them as either int16_t or uint16_t would be incorrect. But the difference between end-1 and start is always strictly less than 1<<widow_size, which makes it positive when interpreted as int16_t. The only other case in which the difference between pos and start may reach a positive value on 16-bit platforms is when indexing is used, the window size is 15 and the backlog is empty, making start 0x8000, and pos gets the sentinel end-of-list value 0xFFFF from the index. This case is handled by changing the sentinel value to 0 when the backlog is not full. (Updated, see below.) Conflicts: heatshrink_encoder.c EDIT: Since commit 7e818d55 eliminates the backlog full calculations by initially filling the backlog with zeroes (i.e., the backlog is always full), the special case sentinel value of 0 is no longer necessary. Closes #19.
-
Vadim Vygonets authored
-
Scott Vokes authored
Closes #18.
-
Vadim Vygonets authored
-
Scott Vokes authored
Using a lookahead size equal to the window size could previously lead to an infinite loop (discovered and fixed by @unixdj, thanks). There really isn't any benefit to having a lookahead size that large, as it only applies when the entire input stream is the same. In all other cases, it makes the compression ratio worse. Closes #20.
-
Vadim Vygonets authored
BUG: When window size equals lookahead, the encoder enters an endless loop after WINDOW_SIZE bytes are sunk into the buffer, unless at EOF. Test: $ echo -n 0123456789abcdefX | ./heatshrink -w 4 -l 4 >/dev/null
-
Scott Vokes authored
-
Vadim Vygonets authored
Closes #21
-
Scott Vokes authored
-
- May 13, 2015
-
-
Vadim Vygonets authored
-
- May 11, 2015
-
-
Vadim Vygonets authored
-
Vadim Vygonets authored
-
- Jan 14, 2014
-
-
Scott Vokes authored
Only check for matches that can be improvements.
-
Scott Vokes authored
Only test whether len is > match_maxlen when at least the first byte matches, and use a pointer to curry in the current offset, saving an add per compare.
-
Scott Vokes authored
This eliminates more unnecessary compares, speeding things up further. (Thanks to user jibsen on hacker news for the original suggestion.)
-
- Dec 28, 2013
-
-
Scott Vokes authored
Also, update index type to int16_t.
-
Scott Vokes authored
-
Scott Vokes authored
Eliminate some unnecessary comparisons by making the position in the index signed, and treating any negative value as not found, not just ((uint16_t)-1). Also, decrease HEATSHRINK_MAX_WINDOW_BITS to 15, so index values with the highest bit set can be used as working space for index optimization on the fly (some day). Use 'if length >= 3' instead of 'length > 2' for the break even point, sinec it makes the intent clearer, and pull the comparison outside of the inner loop. Increase HEATSHRINK_MIN_LOOKAHEAD_BITS to 2, since -l 1 won't compress. Also, eliminate the corresponding MAX #define, since it has to be <= the window size anyway. Use a separate *pt == *pt2 instead of pt[0] == pt2[0] for comparison of the leading byte, which should allow for better branch prediction. Eliminate 'needle_index' in the inner loop -- it's only aliasing 'end'. Also, add comments about the indexing strategy.
-
Scott Vokes authored
This can save a significant amount of time in push_bits, particularly if -w is 8.
-
Scott Vokes authored
-
Scott Vokes authored
1. Instead of storing an offset to the previous matching byte in the index, store the next matching byte bigram. Since the compression doesn't break even unless at least 3 bytes match, any single-byte matches in the index should just be skipped. Since this eliminates these false positives in one pass, it also removes work that would otherwise be done during every walk of the index's offset chain - possibly several times. This greatly reduces the time spent in find_longest_match(). EDIT: This prematurely discards information from the index, damaging the overall compression effectiveness. Some of the links being skipped are still necessary for other chains, so they cannot be eliminated this way. It's still possible dynamic programming can be used for this, but the strategy in this commit doesn't quite work. 2. Eliminate the test of the first byte, since indexing already incorporates this comparison ahead of time.
-
- Dec 21, 2013
-
-
Scott Vokes authored
-
- Dec 19, 2013
-
-
Scott Vokes authored
*comment on rebase*
-
Scott Vokes authored
-
Scott Vokes authored
-
- May 13, 2013
-
-
Scott Vokes authored
This eliminates the possibility of infinite loops caused by counters overflowing and preventing the output buffer full or input exhausted conditions from being reached.
-
- May 12, 2013
-
-
Scott Vokes authored
Since they're only stored as `uint16_t`s, disallow overly large sizes.
-
- May 11, 2013
-
-
Scott Vokes authored
-
- Mar 13, 2013
-
-
Scott Vokes authored
-
Scott Vokes authored
-