We are using the DataStax Enterprise trial. A system that had been working well started failing intermittently with timeout exceptions. This has gone on for several days without resolving itself.
Stack trace of the exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:260)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:113)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:243)
I read that a timeout is sometimes an "in progress" exception, so I planned to trace my query and see whether it was slow. However, after I restarted a single node in the cluster, everything started working perfectly again, so the problem does not appear to be related to my query.
If a single node was timing out, shouldn't the forwarding (coordinator) nodes stop sending requests to that box? The replication factor was set to 1, so maybe they had no alternative replica to use?
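To make the replication-factor concern concrete, here is a toy sketch of replica selection on a token ring. It is not Cassandra's actual placement code, only a simplified model of the idea. The class name `ReplicaSketch`, the node names (`node-a`, `node-b`, `node-c`) and the integer tokens are all hypothetical. It assumes each key is owned by the first node at or after its token, with extra replicas taken from the next nodes around the ring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of replica placement on a token ring (NOT Cassandra's real code). */
public class ReplicaSketch {
    // Hypothetical ring: token -> node name.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addNode(int token, String name) {
        ring.put(token, name);
    }

    /** Walk clockwise from the key's token and collect up to rf nodes. */
    public List<String> replicasFor(int keyToken, int rf) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        Integer start = ring.ceilingKey(keyToken);
        if (start == null) {
            start = ring.firstKey(); // wrap around the ring
        }
        // Nodes in clockwise order, beginning at the owner of the key.
        List<String> ordered = new ArrayList<>(ring.tailMap(start, true).values());
        ordered.addAll(ring.headMap(start, false).values());
        for (String node : ordered) {
            if (replicas.size() == rf) {
                break;
            }
            replicas.add(node);
        }
        return replicas;
    }

    /** True if at least one replica for the key is not the unhealthy node. */
    public boolean canServe(int keyToken, int rf, String unhealthyNode) {
        for (String node : replicasFor(keyToken, rf)) {
            if (!node.equals(unhealthyNode)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ReplicaSketch ring = new ReplicaSketch();
        ring.addNode(0, "node-a");
        ring.addNode(100, "node-b");
        ring.addNode(200, "node-c");

        int keyToken = 150; // assumed token of some row key
        System.out.println("RF=1 replicas: " + ring.replicasFor(keyToken, 1));
        System.out.println("RF=1 servable: " + ring.canServe(keyToken, 1, "node-c"));
        System.out.println("RF=2 replicas: " + ring.replicasFor(keyToken, 2));
        System.out.println("RF=2 servable: " + ring.canServe(keyToken, 2, "node-c"));
    }
}
```

In this model, a key whose only replica is the unhealthy node cannot be served by any other node when the replication factor is 1, whereas with a replication factor of 2 a second node holds a copy. Whether this matches what happened in my cluster is exactly what I am unsure about.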
There are some messages in the node logs. This one appears a lot:
HintedHandOffManager.java (line 374) Timed out replaying hints to /10.33.175.144; aborting further deliveries
And this:
CustomTThreadPoolServer.java (line 210) Error occurred during processing of message.
java.lang.RuntimeException: Failed to open server transport: unknown
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:288)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:260)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:184)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at com.datastax.bdp.transport.server.TPreviewableTransport.readUntilEof(TPreviewableTransport.java:79)
at com.datastax.bdp.transport.server.TPreviewableTransport.preview(TPreviewableTransport.java:55)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport.open(TNegotiatingServerTransport.java:169)
at com.datastax.bdp.transport.server.TNegotiatingServerTransport$Factory.getTransport(TNegotiatingServerTransport.java:281)
... 5 more
Any ideas?
Thanks for any help.