2 Replies Latest reply on Dec 2, 2013 9:26 AM by nl

    TikaTextExtractor and excludedMimeTypes

    nl Newbie



      When I use the excludedMimeTypes attribute in my Tika configuration (in repository.json), I get a ClassCastException in TikaTextExtractor.java:


      java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.tika.mime.MediaType

          at org.modeshape.extractor.tika.TikaTextExtractor.supportsMimeType(TikaTextExtractor.java:119)


      I assume this happens because the field is set by reflection, and the reflection utility does not take the generic element type into account, so the collection ends up holding Strings instead of MediaType instances?


          public boolean supportsMimeType( String mimeType ) {
              MediaType mediaType = MediaType.parse(mimeType);
              if (mediaType == null) {
                  getLogger().debug("Invalid mime-type:" + mimeType);
                  return false;
              }
              for (MediaType excludedMediaType : excludedMimeTypes) { // !!!expects a MediaType but gets a String instead!!!
                  if (excludedMediaType.equals(mediaType)) {
                      return false;
                  }
                  if (excludedMediaType.getSubtype().equalsIgnoreCase("*") && mediaType.getType().equalsIgnoreCase(excludedMediaType.getType())) {
                      return false;
                  }
              }
              return includedMimeTypes.isEmpty() ? supportedMediaTypes.contains(mimeType) : supportedMediaTypes.contains(mimeType)
                                                                                            && includedMimeTypes.contains(mimeType);
          }
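
      To illustrate what I suspect is going on, here is a minimal sketch that does not use ModeShape or Tika at all. ErasureDemo, FakeMediaType and Extractor are hypothetical stand-ins I made up for this example, and I am only assuming the real configuration code does something similar. Because generics are erased at runtime, reflection will happily store a Set of Strings in a field declared as a Set of FakeMediaType, and the failure only shows up later, at the implicit cast in the for-each loop:

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

public class ErasureDemo {

    // Hypothetical stand-in for org.apache.tika.mime.MediaType.
    static final class FakeMediaType {
        final String value;

        FakeMediaType(String value) {
            this.value = value;
        }
    }

    // Hypothetical stand-in for the extractor: a typed set that is populated reflectively.
    static final class Extractor {
        private Set<FakeMediaType> excludedMimeTypes = new HashSet<>();

        boolean supports(String mimeType) {
            // The compiler inserts a cast to FakeMediaType for each element here.
            for (FakeMediaType excluded : excludedMimeTypes) {
                if (excluded.value.equals(mimeType)) {
                    return false;
                }
            }
            return true;
        }
    }

    // Returns true if iterating the reflectively injected set throws ClassCastException.
    static boolean reproducesClassCast() {
        Extractor extractor = new Extractor();
        Set<String> rawValues = new HashSet<>();
        rawValues.add("application/pdf");

        try {
            Field field = Extractor.class.getDeclaredField("excludedMimeTypes");
            field.setAccessible(true);
            // Succeeds: at runtime the field type is just Set, the element type is erased.
            field.set(extractor, rawValues);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }

        try {
            extractor.supports("text/plain");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ClassCastException reproduced: " + reproducesClassCast());
    }
}
```

      If this is indeed the cause, the values from repository.json would need to be converted to MediaType (for example via MediaType.parse) before the field is assigned, but that is only my guess.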


      Any advice?


      Thanks, Niels