Interface Encoding<T>

Type Parameters:
T - The associated Java type
All Superinterfaces:
Comparator<T>, NaturalSortAware, Serializable
All Known Implementing Classes:
AbstractEncoding, ArrayEncoding, Base64ArrayEncoding, BigDecimalEncoding, BigIntegerEncoding, BitSetEncoding, BooleanArrayEncoding, BooleanEncoding, ByteArrayEncoding, ByteEncoding, CharacterArrayEncoding, CharacterEncoding, Concat2Encoding, Concat3Encoding, Concat4Encoding, Concat5Encoding, ConvertedEncoding, DateEncoding, DoubleArrayEncoding, DoubleEncoding, DoubleSummaryStatisticsEncoding, DurationEncoding, EnumValueEncoding, FileEncoding, FloatArrayEncoding, FloatEncoding, Inet4AddressEncoding, Inet6AddressEncoding, InetAddressEncoding, InstantEncoding, IntegerArrayEncoding, IntegerEncoding, IntegralArrayEncoding, IntegralEncoding, InternetAddressEncoding, IntSummaryStatisticsEncoding, LocalDateEncoding, LocalDateTimeEncoding, LocalTimeEncoding, LongArrayEncoding, LongEncoding, LongSummaryStatisticsEncoding, MonthDayEncoding, NullSafeEncoding, NumberEncoding, ObjectArrayEncoding, ObjIdEncoding, OffsetDateTimeEncoding, OffsetTimeEncoding, PatternEncoding, PeriodEncoding, PrimitiveEncoding, PrimitiveWrapperEncoding, ReferenceEncoding, ShortArrayEncoding, ShortEncoding, StringConvertedEncoding, StringEncoding, Tuple2Encoding, Tuple3Encoding, Tuple4Encoding, Tuple5Encoding, TupleEncoding, UnsignedIntEncoding, URIEncoding, UUIDEncoding, VoidEncoding, YearEncoding, YearMonthEncoding, ZonedDateTimeEncoding, ZoneIdEncoding, ZoneOffsetEncoding

public interface Encoding<T> extends Comparator<T>, NaturalSortAware, Serializable
A range of values of some Java type, along with string and binary encodings and a total ordering of those values.

Encodings are used to map between instances of some Java type and the byte[] encodings of those instances stored in a Permazen database. The byte[] encoding defines the database sort order (via unsigned lexicographical ordering), and this same ordering is reflected in Java via compare().
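As an illustration of how a byte[] encoding can sort the same way compare() does, here is a minimal sketch for int values. It uses the common technique of flipping the sign bit and writing big-endian bytes; the class and method names are hypothetical, and this is not claimed to be the exact format used by Permazen's IntegerEncoding.

```java
import java.util.Arrays;

public class OrderPreservingIntSketch {

    // Encode an int as 4 big-endian bytes with the sign bit flipped, so that
    // unsigned lexicographic byte order matches signed numeric order.
    public static byte[] encode(int value) {
        int flipped = value ^ 0x80000000;
        return new byte[] {
            (byte)(flipped >>> 24), (byte)(flipped >>> 16),
            (byte)(flipped >>> 8), (byte)flipped
        };
    }

    // Inverse of encode()
    public static int decode(byte[] bytes) {
        int flipped = (bytes[0] & 0xff) << 24 | (bytes[1] & 0xff) << 16
          | (bytes[2] & 0xff) << 8 | (bytes[3] & 0xff);
        return flipped ^ 0x80000000;
    }

    public static void main(String[] args) {
        int[] samples = { Integer.MIN_VALUE, -2, -1, 0, 1, 2, Integer.MAX_VALUE };
        for (int a : samples) {
            for (int b : samples) {
                // Java ordering vs. unsigned lexicographic ordering of the encodings
                int javaOrder = Integer.signum(Integer.compare(a, b));
                int byteOrder = Integer.signum(Arrays.compareUnsigned(encode(a), encode(b)));
                if (javaOrder != byteOrder)
                    throw new AssertionError(a + " vs " + b);
            }
        }
        System.out.println("orderings agree");
    }
}
```

Without the sign-bit flip, negative values would encode with a leading byte of 0x80 or higher and sort after all non-negative values, which would disagree with compare().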

An Encoding also defines a mapping between Java instances and String values.

Instances may have an associated EncodingId, which is a globally unique URN-style identifier that allows the encoding to be referred to by name (e.g., in an EncodingRegistry). Encodings with no EncodingId are called anonymous.

Encodings must satisfy these requirements:

  • Instances have an associated Java type which can represent any of the encoding's supported values. However, an encoding is not required to support every instance of the Java type. For example, there can be an encoding of Integer that only supports non-negative values.
  • Instances totally order their supported Java values via compare(). If the associated Java type itself implements Comparable, then the two orderings do not necessarily have to agree, but they should if possible. In that case, NaturalSortAware.sortsNaturally() should return true.
  • null may or may not be a supported value; see supportsNull(). If so, it must be a fully supported value just like any other; for example, it must be handled by compare() (typically null values sort last). Note that this is an additional requirement beyond what Comparator strictly requires.
  • There is a default value. For types that support null, the default value must be null, and for types that don't support null, obviously the default value must not be null; however, an exception can be made for encodings that don't support null but don't need default values, e.g., anonymous encodings that are always wrapped within a NullSafeEncoding.
  • All non-null values can be encoded/decoded into a String without losing information; see toString() and fromString().
  • All values, including null if supported, can be encoded/decoded into a self-delimiting binary string (i.e., byte[] array) without losing information. Moreover, these binary strings, when sorted lexicographically using unsigned comparison, sort consistently with the encoding's total ordering of the corresponding Java values; see read() and write().
  • An Encoding's string and binary encodings and sort ordering are guaranteed to never change, unless the EncodingId is also changed, which effectively defines a new encoding. However, in such scenarios automatic schema migrations are easily handled by adding appropriate logic to convert().

Two Encoding instances should be equal according to equals() only when they behave identically with respect to all of the above.

Instances must be stateless (and therefore also thread safe).

  • Field Details

    • MAX_ARRAY_DIMENSIONS

      static final int MAX_ARRAY_DIMENSIONS
      The maximum number of supported array dimensions (255).
  • Method Details

    • getEncodingId

      EncodingId getEncodingId()
      Get the globally unique encoding ID that identifies this encoding, if any.

      Once associated with a specific encoding, an encoding ID must never be changed or reused. If an Encoding's encoding changes in any way, then its encoding ID must also change. This applies only to the encoding itself, and not the associated Java type. For example, an Encoding's associated Java type can change over time, e.g., if the Java class changes package or class name.

      Returns:
      this encoding's unique ID, or null if this encoding is anonymous
    • getTypeToken

      TypeToken<T> getTypeToken()
      Get the Java type corresponding to this encoding's values.
      Returns:
      the Java type used to represent this encoding's values
    • read

      T read(ByteReader reader)
      Read a value from the given input.
      Parameters:
      reader - byte input
      Returns:
      decoded value (possibly null)
      Throws:
      IllegalArgumentException - if invalid input is encountered
      IndexOutOfBoundsException - if input is truncated
      IllegalArgumentException - if reader is null
    • write

      void write(ByteWriter writer, T value)
      Write a value to the given output.
      Parameters:
      writer - byte output
      value - value to write (possibly null)
      Throws:
      IllegalArgumentException - if value is null and this encoding does not support null
      IllegalArgumentException - if writer is null
    • getDefaultValueBytes

      default byte[] getDefaultValueBytes()
      Get the default value for this encoding encoded as a byte[] array.

      The implementation in Encoding returns the binary encoding of the value returned by getDefaultValue().

      Returns:
      encoded default value
      Throws:
      UnsupportedOperationException - if this encoding does not have a default value
    • getDefaultValue

      T getDefaultValue()
      Get the default value for this encoding.

      If this encoding supports null values, then this must return null.

      Returns:
      default value
      Throws:
      UnsupportedOperationException - if this encoding does not have a default value
    • skip

      void skip(ByteReader reader)
      Read and discard a byte[] encoded value from the given input.

      If the value skipped over is invalid, this method may, but is not required to, throw IllegalArgumentException.

      If the value skipped over is truncated, this method must throw IndexOutOfBoundsException.

      Parameters:
      reader - byte input
      Throws:
      IllegalArgumentException - if invalid input is encountered
      IndexOutOfBoundsException - if input is truncated
      IllegalArgumentException - if reader is null
    • toString

      String toString(T value)
      Encode a non-null value as a String for later decoding by fromString().

      The returned String, when decoded into 32-bit Unicode codepoints, must contain only valid XML characters (see XMLUtil.isValidChar(int)).

      Parameters:
      value - actual value, never null
      Returns:
      string encoding of value acceptable to fromString()
      Throws:
      IllegalArgumentException - if value is null
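The XML character requirement can be checked codepoint by codepoint. The sketch below assumes that "valid XML character" means the Char production of XML 1.0; whether XMLUtil.isValidChar(int) applies exactly this rule is an assumption, and the class and method names here are hypothetical.

```java
public class XmlSafeStringSketch {

    // XML 1.0 "Char" production:
    //   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Check every codepoint of the string; an unpaired surrogate is reported by
    // codePoints() as its own value (0xD800-0xDFFF) and is therefore rejected.
    public static boolean isXmlSafe(String s) {
        return s.codePoints().allMatch(XmlSafeStringSketch::isValidXmlChar);
    }

    public static void main(String[] args) {
        System.out.println(isXmlSafe("hello\tworld"));            // tab is allowed
        System.out.println(isXmlSafe("bad" + (char)0x1 + "char")); // control character is not
    }
}
```

A toString() implementation whose raw output could contain such characters would need to escape them in a way that fromString() can reverse.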
    • fromString

      T fromString(String string)
      Parse a non-null value previously encoded by toString(T).
      Parameters:
      string - non-null value previously encoded as a String by toString(T)
      Returns:
      actual value
      Throws:
      IllegalArgumentException - if the input is invalid
      IllegalArgumentException - if string is null
    • convert

      default <S> T convert(Encoding<S> encoding, S value)
      Attempt to convert a value from the given Encoding into a value of this Encoding.

      For a non-null value, the implementation in Encoding first checks whether the value is already a valid value for this encoding; if so, the value is returned. Otherwise, it invokes encoding.toString(value) to convert value into a String, and then attempts to parse that string via this.fromString(); if the parse fails, an IllegalArgumentException is thrown. Note that this means any value will convert successfully to a String, as long as it doesn't contain an invalid escape sequence (see StringEncoding.toString(java.lang.String)).

      If value is null, the implementation in Encoding returns null, unless this encoding does not support null values, in which case an IllegalArgumentException is thrown.

      Permazen's built-in encodings include the following conversions:

      • Non-boolean primitive types:
        • Convert from other non-boolean primitive types as if by the corresponding Java cast
        • Convert from boolean by converting to zero (if false) or one (if true)
      • Boolean: converts from other primitive types as if by value != 0
      • A char[] array and a String are convertible to each other
      • A char and a String of length one are convertible to each other (other Strings are not)
      • Arrays: converted by converting each array element individually (if possible)
      Type Parameters:
      S - source encoding
      Parameters:
      encoding - the Encoding of value
      value - the value to convert
      Returns:
      value converted to this instance's type
      Throws:
      IllegalArgumentException - if the conversion fails
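The default conversion strategy described for convert() (pass valid values through, otherwise round-trip via a String) can be sketched as follows. MiniEncoding is a hypothetical, stripped-down stand-in for Encoding written only for this example; it is not the Permazen interface, and the real default method may differ in detail.

```java
public class ConvertSketch {

    // Hypothetical stand-in for Encoding<T> with just the methods the sketch needs
    public interface MiniEncoding<T> {
        String toString(T value);
        T fromString(String string);
        boolean supportsNull();
        T validate(Object obj);     // throws IllegalArgumentException if unsupported

        default <S> T convert(MiniEncoding<S> source, S value) {
            if (value == null) {
                if (!supportsNull())
                    throw new IllegalArgumentException("null not supported");
                return null;
            }
            try {
                return validate(value);                 // already valid for this encoding?
            } catch (IllegalArgumentException e) {
                // not valid as-is; fall back to string conversion
            }
            return fromString(source.toString(value));  // throws IllegalArgumentException on failure
        }
    }

    public static final MiniEncoding<Integer> INT = new MiniEncoding<Integer>() {
        public String toString(Integer value) { return Integer.toString(value); }
        public Integer fromString(String string) { return Integer.valueOf(string); }
        public boolean supportsNull() { return false; }
        public Integer validate(Object obj) {
            if (!(obj instanceof Integer))
                throw new IllegalArgumentException("not an Integer");
            return (Integer)obj;
        }
    };

    public static final MiniEncoding<String> STR = new MiniEncoding<String>() {
        public String toString(String value) { return value; }
        public String fromString(String string) { return string; }
        public boolean supportsNull() { return true; }
        public String validate(Object obj) {
            if (obj != null && !(obj instanceof String))
                throw new IllegalArgumentException("not a String");
            return (String)obj;
        }
    };

    public static void main(String[] args) {
        System.out.println(INT.convert(STR, "42"));     // parsed via fromString()
        System.out.println(STR.convert(INT, 7));        // converted via toString()
    }
}
```

In this sketch, converting the String "abc" to INT fails because Integer.valueOf() throws NumberFormatException, which is a subclass of IllegalArgumentException.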
    • validate

      default T validate(Object obj)
      Verify the given object is a valid instance of this Encoding's Java type and cast it to that type.

      Note that this method must throw IllegalArgumentException, not ClassCastException or NullPointerException, if obj does not have the correct type, or is an unsupported value - including null if null is not supported.

      This method is allowed to perform widening conversions of the object that lose no information, e.g., from Integer to Long.

      The implementation in Encoding first verifies the value is not null if this instance does not allow null values, and then attempts to cast the value using this instance's raw Java type. Subclasses should override this method to implement any other restrictions.

      Parameters:
      obj - object to validate
      Returns:
      obj cast to this encoding's type
      Throws:
      IllegalArgumentException - if obj is not of type T
      IllegalArgumentException - if obj is null and this encoding does not support null values
      IllegalArgumentException - if obj is in any other way not supported by this Encoding
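The exception contract of validate() can be sketched with plain JDK classes: a null check, then a cast through the raw type, with ClassCastException translated into IllegalArgumentException. The class and method names below are hypothetical, and the non-negative Integer restriction stands in for the kind of extra check a subclass override might add.

```java
public class ValidateSketch {

    // Validate obj against a raw type, reporting every failure as IllegalArgumentException
    public static <T> T validate(Class<T> rawType, boolean supportsNull, Object obj) {
        if (obj == null) {
            if (!supportsNull)
                throw new IllegalArgumentException("null value not supported");
            return null;
        }
        try {
            return rawType.cast(obj);
        } catch (ClassCastException e) {
            throw new IllegalArgumentException("value has type " + obj.getClass().getName()
              + ", expected " + rawType.getName(), e);
        }
    }

    // Subclass-style extra restriction: only non-negative Integer values are supported
    public static Integer validateNonNegative(Object obj) {
        Integer value = validate(Integer.class, false, obj);
        if (value < 0)
            throw new IllegalArgumentException("negative value: " + value);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(validateNonNegative(5));
        try {
            validateNonNegative("five");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```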
    • compare

      int compare(T value1, T value2)
      Order two values.

      This method must provide a total ordering of all supported Java values that is consistent with the database ordering, i.e., the unsigned lexicographical ordering of the corresponding byte[] encoded values.

      If null is a supported Java value, then this method must accept null parameters without throwing an exception (note, this is a stronger requirement than the Comparator interface normally requires).

      Note: by convention, null values usually sort last.

      Specified by:
      compare in interface Comparator<T>
      Throws:
      IllegalArgumentException - if value1 or value2 is null and this encoding does not support null
    • supportsNull

      boolean supportsNull()
      Determine whether this encoding supports null values.
      Returns:
      true if null is a valid value, otherwise false
    • hasPrefix0x00

      boolean hasPrefix0x00()
      Determine whether any of this encoding's encoded values start with a 0x00 byte. Certain optimizations are possible when this is not the case. It is safe for this method to always return true.

      Note: changing the result of this method may result in an incompatible encoding if this encoding is wrapped in another class.

      Returns:
      true if an encoded value starting with 0x00 exists
    • hasPrefix0xff

      boolean hasPrefix0xff()
      Determine whether any of this encoding's encoded values start with a 0xff byte. Certain optimizations are possible when this is not the case. It is safe for this method to always return true.

      Note: changing the result of this method may result in an incompatible encoding if this encoding is wrapped in another class.

      Returns:
      true if an encoded value starting with 0xff exists
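One kind of optimization this makes possible: if no encoded value starts with 0xff, a wrapper can represent null as the single byte 0xff with no extra prefix on non-null values, and null then sorts last. The sketch below illustrates the idea with a hypothetical inner encoding of single bytes in the range 0x00 to 0x7f; it is an assumption for illustration, not a description of what any Permazen class does.

```java
import java.util.Arrays;

public class NullLastSketch {

    // Hypothetical inner encoding: one byte in 0x00..0x7f, so no value starts with 0xff
    public static byte[] encodeNullable(Byte value) {
        if (value == null)
            return new byte[] { (byte)0xff };       // sentinel; sorts after every non-null value
        if (value < 0)
            throw new IllegalArgumentException("unsupported value " + value);
        return new byte[] { value };
    }

    public static Byte decodeNullable(byte[] bytes) {
        if (bytes == null || bytes.length != 1)
            throw new IllegalArgumentException("invalid encoding");
        if (bytes[0] == (byte)0xff)
            return null;
        if (bytes[0] < 0)
            throw new IllegalArgumentException("invalid encoding");
        return bytes[0];
    }

    public static void main(String[] args) {
        byte[] max = encodeNullable((byte)0x7f);
        byte[] nul = encodeNullable(null);
        // Negative result: the largest non-null value sorts before null
        System.out.println(Arrays.compareUnsigned(max, nul) < 0);
    }
}
```

If some value's encoding could start with 0xff, the sentinel would be ambiguous and every non-null value would need a distinguishing prefix byte, which is why always returning true is safe but forgoes the optimization.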
    • getFixedWidth

      OptionalInt getFixedWidth()
      Get the fixed width of this encoding, if any.

      Some encodings encode every value into the same number of bytes. For such encodings, this method returns that number. For variable width encodings, this method must return empty.

      Returns:
      the number of bytes of every encoded value, or empty if the encoding length varies
    • validateAndWrite

      default void validateAndWrite(ByteWriter writer, Object obj)
      Convenience method that both validates and encodes a value.

      Equivalent to:

       this.write(writer, this.validate(obj))
       
      Parameters:
      writer - byte output
      obj - object to validate
      Throws:
      IllegalArgumentException - if obj is not of type T
      IllegalArgumentException - if obj is null and this encoding does not support null values
      IllegalArgumentException - if obj is in any other way not supported by this Encoding
      IllegalArgumentException - if writer is null
    • getKeyRange

      default KeyRange getKeyRange(Bounds<? extends T> bounds)
      Calculate the KeyRange that includes exactly those encoded values that lie within the given bounds.
      Parameters:
      bounds - bounds to impose
      Returns:
      KeyRange corresponding to bounds
      Throws:
      IllegalArgumentException - if bounds is null
    • encode

      default byte[] encode(T value)
      Encode the given value into a byte[] array.

      The implementation in Encoding creates a temporary ByteWriter and then delegates to write().

      Parameters:
      value - value to encode, possibly null
      Returns:
      encoded value
      Throws:
      IllegalArgumentException - if value is invalid
    • decode

      default T decode(byte[] bytes)
      Decode a value from the given byte[] array.

      The implementation in Encoding creates a temporary ByteReader and then delegates to read().

      Parameters:
      bytes - encoded value
      Returns:
      decoded value, possibly null
      Throws:
      IllegalArgumentException - if bytes is null, invalid, or contains trailing garbage
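The relationship between encode()/decode() and write()/read() can be sketched with JDK classes standing in for ByteWriter and ByteReader (ByteArrayOutputStream and ByteBuffer here; both substitutions are assumptions made for this example). The encoding is a hypothetical self-delimiting one: UTF-8 bytes followed by a 0x00 terminator, with strings containing NUL rejected. Reporting truncated input from decode() as IllegalArgumentException is also this sketch's choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DecodeSketch {

    // write(): UTF-8 bytes followed by a 0x00 terminator
    public static void write(ByteArrayOutputStream out, String value) {
        if (value == null || value.indexOf('\0') >= 0)
            throw new IllegalArgumentException("unsupported value");
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        out.write(utf8, 0, utf8.length);
        out.write(0);
    }

    // read(): consume bytes up to and including the terminator, leaving the rest untouched
    public static String read(ByteBuffer in) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        while (true) {
            if (!in.hasRemaining())
                throw new IndexOutOfBoundsException("truncated input");
            byte b = in.get();
            if (b == 0)
                break;
            buf.write(b);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // encode(): create a temporary writer and delegate to write()
    public static byte[] encode(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, value);
        return out.toByteArray();
    }

    // decode(): create a temporary reader, delegate to read(), then reject leftovers
    public static String decode(byte[] bytes) {
        if (bytes == null)
            throw new IllegalArgumentException("null bytes");
        ByteBuffer in = ByteBuffer.wrap(bytes);
        String value;
        try {
            value = read(in);
        } catch (IndexOutOfBoundsException e) {
            throw new IllegalArgumentException("truncated input", e);
        }
        if (in.hasRemaining())
            throw new IllegalArgumentException("trailing garbage");
        return value;
    }

    public static void main(String[] args) {
        System.out.println(decode(encode("hello")));
    }
}
```

Because read() stops at the terminator, several values can be written back to back and read in sequence; decode() adds the trailing-garbage check because it is handed exactly one value.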