Pangool User Guide

Custom Comparators

You can supply your own RawComparator to compare a field in a Tuple. Pass it as the last argument when specifying the sort order, for example:

 cg.setOrderBy(new SortBy().add("word", Order.DESC, new MyUtf8Comparator()));

Example implementation, which orders values only by the length of their serialized bytes:

 private static class MyUtf8Comparator implements RawComparator<Text>, Serializable {
   // Object-level comparison is left unimplemented in this example.
   public int compare(Text arg0, Text arg1) {
     throw new NotImplementedException();
   }

   // Compares the serialized bytes directly; only their lengths are taken into account here.
   public int compare(byte[] buf1, int off1, int len1, byte[] buf2, int off2, int len2) {
     if (len1 > len2) {
       return 1;
     } else if (len1 < len2) {
       return -1;
     } else {
       return 0;
     }
   }
 }
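
The length-only comparison above is deliberately simple. As a rough sketch of a fuller byte-level comparison, the following plain-JDK method compares two byte ranges lexicographically, treating each byte as unsigned. The class and method names are hypothetical and not part of Pangool. The sketch assumes that the offset and length delimit exactly the bytes to compare; a real serialized field may carry a length prefix that would have to be skipped first.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: plain JDK, no Hadoop or Pangool classes involved.
class RawBytesOrderSketch {

  // Compares two byte ranges lexicographically, treating each byte as unsigned.
  // If one range is a prefix of the other, the shorter range sorts first.
  static int compareBytes(byte[] buf1, int off1, int len1,
                          byte[] buf2, int off2, int len2) {
    int common = Math.min(len1, len2);
    for (int i = 0; i < common; i++) {
      int a = buf1[off1 + i] & 0xff;
      int b = buf2[off2 + i] & 0xff;
      if (a != b) {
        return a < b ? -1 : 1;
      }
    }
    return Integer.compare(len1, len2);
  }

  public static void main(String[] args) {
    byte[] first = "apple".getBytes(StandardCharsets.UTF_8);
    byte[] second = "apply".getBytes(StandardCharsets.UTF_8);
    System.out.println(compareBytes(first, 0, first.length, second, 0, second.length));
  }
}
```

For UTF-8 data, unsigned byte order coincides with Unicode code point order, so a comparison of this kind can sort strings without deserializing them.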

If you need to deserialize your object in order to compare it, you can subclass BaseComparator and compare the deserialized objects directly.
For instance, consider a Thrift object defined as:

 struct MyThriftObject {
   1: string userName,
   2: i32 age
 }

Then your custom comparator could look like this:

 public class MyThriftComparator extends BaseComparator<MyThriftObject> {

   public MyThriftComparator() {
     super(Type.OBJECT, MyThriftObject.class);
   }

   public int compare(MyThriftObject a, MyThriftObject b) {
     return a.getUserName().compareTo(b.getUserName());
   }
 }

Important: For cases like the one above, where only one field needs to be compared, we don't recommend using a custom comparator.
For efficiency, it is preferable to duplicate that field out of the serialized object and perform the comparison on the basic-typed copy.
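
As a rough illustration of this recommendation, the following plain-JDK sketch keeps a basic-typed copy of the user name next to the serialized object and sorts on that copy alone, so the payload bytes are never deserialized. The Row class and its field names are hypothetical stand-ins for a tuple with two fields; they are not Pangool classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only: plain JDK, no Thrift or Pangool classes involved.
class DuplicatedFieldSketch {

  // Hypothetical stand-in for a tuple with two fields.
  static final class Row {
    final String userName;   // copy of the field to sort by, kept as a basic type
    final byte[] serialized; // the serialized object, carried along as opaque bytes

    Row(String userName, byte[] serialized) {
      this.userName = userName;
      this.serialized = serialized;
    }
  }

  // Sorts using only the duplicated field; the serialized bytes are never read.
  static void sortByUserName(List<Row> rows) {
    rows.sort(Comparator.comparing((Row r) -> r.userName));
  }

  public static void main(String[] args) {
    List<Row> rows = new ArrayList<>();
    rows.add(new Row("carol", new byte[] { 3 }));
    rows.add(new Row("alice", new byte[] { 1 }));
    rows.add(new Row("bob", new byte[] { 2 }));
    sortByUserName(rows);
    for (Row row : rows) {
      System.out.println(row.userName);
    }
  }
}
```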
