Channel: DataStax Support Forums » Recent Topics

hwangj on "Adding new node failed in Datastax Enterprise 3.0"


Hi,
I was able to install DataStax Enterprise 3.0 and add 3 nodes successfully when I first did it around April 8th. When I tried to add another new Cassandra node recently, I got "Install Errored: Unable to dry-run package installation". Close to the bottom of the message window, it states: "dse: Depends: dse-libhadoop-native (= 3.0-1) but it is not going to be installed. Recommends: sun-java6-jre (>= 6.24-1~squeeze1) but it is not installable".

Why am I getting this error now when I did not get it during my first installation? I do have Java version "1.6.0_43" installed.


sabhub on "Accessing S3 from local Datastax Enterprise PIG installation on my Mac: DSE 2.2.1"


Hello,

I am trying to read a file on S3 using a Pig script. I have added the S3 id and keys to the hadoop_site.xml file. When I try to execute the query, I get the error below. I am trying to build a POC for my project, so any help is appreciated.
The files are in gz.gpg format on S3.

Error:

A MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Path must be absolute: s3://my.bucket.img
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:282)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.IllegalArgumentException: Path must be absolute: s3://chegg.edw.ereader.incoming
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.pathToKey(Jets3tFileSystemStore.java:325)
at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.retrieveINode(Jets3tFileSystemStore.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at $Proxy7.retrieveINode(Unknown Source)
at org.apache.hadoop.fs.s3.S3FileSystem.getFileStatus(S3FileSystem.java:332)
at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1337)
at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1008)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:215)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:270)
... 14 more
cfs:/tmp/temp370734180/tmp2140428011,

I appreciate any help.

kkiran1984 on "HUnavailableException:May not be enough replicas present to handle consistency level."


Hi all,

I am using Cassandra 1.2.4 on my machine (Windows 7).

I have 3 nodes in DC1, all on my machine; I am only using one DC. I set the replication factor to 2 and the consistency level to HConsistencyLevel.ONE. However, when one of the nodes is down and I attempt to read from or write to the DB, I get the error "May not be enough replicas present to handle consistency level.".
I am under the impression that with consistency level ONE, reads and writes should succeed as long as even one replica is up, yet I get this error. Could someone tell me what I am doing wrong? I want reads and writes to keep working in the event of a node failure. Below is my code.
// Hector client code; imports below assume Hector 1.x and the com.eaio UUID library it bundles.
import java.util.HashMap;
import java.util.Map;

import com.eaio.uuid.UUID;

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.cassandra.service.template.ColumnFamilyTemplate;
import me.prettyprint.cassandra.service.template.ColumnFamilyUpdater;
import me.prettyprint.cassandra.service.template.ThriftColumnFamilyTemplate;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

String keySpaceName = "kspace";
String clusterName = "Test Cluster";
String columnFamilyName = "ktable";
String host = "127.0.0.1:9160,127.0.0.2:9161,127.0.0.3:9162";
int replicationFactor = 2;

// Connect to the cluster and look up the keyspace (null if it does not exist yet)
CassandraHostConfigurator cassandraHostConfigurator = new CassandraHostConfigurator(host);
Cluster cluster = HFactory.getOrCreateCluster(clusterName, cassandraHostConfigurator);
KeyspaceDefinition keyspaceDef = cluster.describeKeyspace(keySpaceName);

// Define CL.ONE for reads and writes on column family "ktable"
ConfigurableConsistencyLevel configurableConsistencyLevel = new ConfigurableConsistencyLevel();
Map<String, HConsistencyLevel> clmap = new HashMap<String, HConsistencyLevel>();
clmap.put(columnFamilyName, HConsistencyLevel.ONE);
configurableConsistencyLevel.setReadCfConsistencyLevels(clmap);
configurableConsistencyLevel.setWriteCfConsistencyLevels(clmap);

// Create the keyspace (SimpleStrategy, RF 2) and add it to the cluster if it does not exist
if (keyspaceDef == null) {
    KeyspaceDefinition newKeyspace = HFactory.createKeyspaceDefinition(
            keySpaceName, ThriftKsDef.DEF_STRATEGY_CLASS, replicationFactor, null);
    cluster.addKeyspace(newKeyspace, true);
}
Keyspace keyspace = HFactory.createKeyspace(keySpaceName, cluster, configurableConsistencyLevel);

// Write one row with key "xkey"
StringSerializer ss = StringSerializer.get();
ColumnFamilyTemplate<String, String> cft = new ThriftColumnFamilyTemplate<String, String>(keyspace, columnFamilyName, ss, ss);
ColumnFamilyUpdater<String, String> updater = cft.createUpdater("xkey");
UUID uid = new UUID(); // time-based UUID from the com.eaio library, not java.util.UUID
updater.setValue("id", Long.toString(uid.getClockSeqAndNode()), ss);
updater.setValue("name", "Catherine", ss);
updater.setValue("state", "GA", ss);
cft.update(updater);
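For what it's worth, the keyspace this code creates should be roughly equivalent to the following CQL 3 definition on Cassandra 1.2 (a sketch based on the Hector defaults used above, not a statement I actually ran):

CREATE KEYSPACE kspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};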
Regards,
Kiran.

prateek@bloomreach.com on "What cassandra client to use for hadoop integration?"


Hi All,

I am trying to build a data services layer using Cassandra as the backend store. I am new to Cassandra and not sure which client to use - Thrift or CQL 3? We have a lot of MapReduce jobs using Amazon Elastic MapReduce (EMR) that will be reading/writing data from Cassandra at high volume. The total data volume will be > 100 TB, with billions of rows in Cassandra. The MapReduce jobs may be read- or write-heavy with high qps (>1000 qps). The requirements are as follows:

* Simplicity of client code. It seems Thrift has built-in integration with Hadoop for bulk data loading using sstableloader (http://www.datastax.com/dev/blog/bulk-loading).
* Ability to define new columns at run time. We may need to add more columns depending on application requirements. It seems CQL 3 does not allow columns to be defined dynamically at runtime (see the sketch after this list).
* Performance of bulk reads/writes. I am not sure which client is better here; however, I found this post claiming the Thrift client has better performance for high data volumes: http://jira.pentaho.com/browse/PDI-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
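For the dynamic-column point, here is a minimal CQL 3 sketch of the clustering-column layout that is commonly suggested as the stand-in for dynamic columns (the table and column names are placeholders, not our real schema):

CREATE TABLE user_attributes (
  row_key text,
  attr_name text,   -- the "dynamic column" name becomes data
  attr_value text,
  PRIMARY KEY (row_key, attr_name)
);

-- adding a brand-new "column" at runtime is just another insert
INSERT INTO user_attributes (row_key, attr_name, attr_value) VALUES ('user42', 'new_metric', 'some value');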

I could not find any authoritative source of information that answers this question. I would appreciate any help with this, since I am sure it is a common problem for many folks and an answer would benefit the overall community.

Many thanks in advance.

-Prateek

raihan26 on "Datastax API- CQL Binary protocol example"


I have started working with the Cassandra database. I am planning to use the DataStax API (which uses the new CQL binary protocol) to upsert into and read from the Cassandra database. I am totally new to this API, and I am not able to find much documentation with proper examples either.

When I was working with the Cassandra CLI using the Netflix client (Astyanax), I created the column family like this:

create column family profile
with key_validation_class = 'UTF8Type'
and comparator = 'UTF8Type'
and default_validation_class = 'UTF8Type'
and column_metadata = [
  {column_name : crd, validation_class : 'DateType'},
  {column_name : lmd, validation_class : 'DateType'},
  {column_name : account, validation_class : 'UTF8Type'},
  {column_name : advertising, validation_class : 'UTF8Type'},
  {column_name : behavior, validation_class : 'UTF8Type'},
  {column_name : info, validation_class : 'UTF8Type'}
  ];

Now I am trying to do the same thing using the DataStax API. So, to start working with the DataStax API, do I need to create the column family in some different way than above? Or will the above column family work fine when I insert data into the Cassandra database using the DataStax API?

If the above column family will not work, then:

First of all, I have created the keyspace like below:

CREATE KEYSPACE USERS WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor = '1';

Now I am confused about how to create the table. I am not sure which is the right way to do it.

Should I create it like this?

CREATE TABLE profile (
id varchar,
account varchar,
advertising varchar,
behavior varchar,
info varchar,
PRIMARY KEY (id)
);

or should I create it like this?

CREATE COLUMN FAMILY profile (
id varchar,
account varchar,
advertising varchar,
behavior varchar,
info varchar,
PRIMARY KEY (id)
);

And also, how do I add:

crd as DateType
lmd as DateType

in the above table or column family while working with the DataStax API?
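To make the question concrete, here is a minimal CQL 3 sketch of what I imagine the table would look like, assuming (and this is only my assumption) that the CQL timestamp type is the counterpart of the Thrift DateType validator:

CREATE TABLE profile (
  id varchar,
  crd timestamp,  -- assumed equivalent of DateType
  lmd timestamp,  -- assumed equivalent of DateType
  account varchar,
  advertising varchar,
  behavior varchar,
  info varchar,
  PRIMARY KEY (id)
);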

Any help will be appreciated.

Joel Belog on "CQLSH with Kerberos issue - missing files?"


Hello, I'm trying to follow the documentation on how to enable cqlsh to use Kerberos. I'm finding that the header files for the pyKerberos source code are missing from the file provided by DataStax. Has anyone else run into this, and how did you resolve it?

Thanks.

andy on "newbie: trouble with cql containers and secondary index on composite keys"


Hi

I am new to Cassandra and DataStax Enterprise. I am having a lot of trouble with the sample music app in the documentation.

http://www.datastax.com/docs/1.2/ddl/table#compound-keys-and-clustering

I am not able to create an index on the playlists table.

Here is how I start cqlsh on my Mac:

$ ~/cassandra/dataStax/dse-3.0.1/bin/cqlsh --version
cqlsh 2.2.0
$

dataStax/dse-3.0.1/bin/cqlsh --cql3

Also the container types do not seem to work.

cqlsh:aedwip> ALTER TABLE songs ADD tags set<text>;
Bad Request: line 1:27 no viable alternative at input 'set'

Any idea what I am doing wrong?

thanks in advance

Andy

CREATE TABLE playlists (
id uuid,
song_id uuid,
album text,
artist text,
title text,
PRIMARY KEY (id, song_id)
) WITH
comment='' AND
caching='KEYS_ONLY' AND
read_repair_chance=0.100000 AND
gc_grace_seconds=864000 AND
replicate_on_write='true' AND
compaction_strategy_class='SizeTieredCompactionStrategy' AND
compression_parameters:sstable_compression='SnappyCompressor';

cqlsh:aedwip> CREATE INDEX ON playlists(artist);
Bad Request: Secondary indexes are not (yet) supported on tables with composite PRIMARY KEY
Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.
cqlsh:aedwip>

Geetanjali on "Issue with date format in cassandra and Solr"


I am using DSE 3.0.1 on a 2-node cluster; one node is for the Cassandra daemon and the other is for Solr.
I am trying to use the date format yyyy-mm-dd.

For this I have created a field in the Cassandra table as timestamp and, in schema.xml, mapped it to the type solr.TrieDateField.
E.g. I added the value as 2013-12-13 in Cassandra.
The Cassandra table shows it as 2013-12-13:00:00:00+0530 and the Solr UI shows it as 2013-12-13:T18:30:00z.

The problem is: why does Solr show 18:30:00z for the time when it should display 00:00:00z?
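In case it helps, here is a minimal cqlsh sketch of what I am doing, plus a variant with an explicit timezone that I am wondering about (the table and column names below are placeholders, not my real schema):

-- value entered without a timezone; as far as I understand, the coordinator's local zone (+0530 here) is assumed
INSERT INTO events (id, event_date) VALUES ('a1', '2013-12-13');

-- the same calendar date pinned to UTC explicitly
INSERT INTO events (id, event_date) VALUES ('a2', '2013-12-13 00:00:00+0000');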

How shall I resolve this?

Thanks
Geetanjali


avon on "Error creating Solr indexes on CQL3 created tables"


Hi,

I created a table using CQL 3 and subsequently went ahead to create a Solr index, and encountered the error below when curl-ing solrconfig.xml.

CREATE TABLE receiptrepository (
CustomerId uuid,
ReceiptDate timestamp,
ReceiptId uuid,
MerchantName varchar,
MallName varchar,
PRIMARY KEY (CustomerId, ReceiptDate, ReceiptId, MerchantName, MallName)
);

curl --data-binary file=@solrconfig.xml -H 'Content-type:text/xml; charset=utf-8' http://192.168.56.135:8983/solr/resource/demodb.receiptrepository/solrconfig.xml
(By the way, solrconfig.xml is reused from the wiki-solr demo.)

The error I get: "Solr indexes are not supported on ColumnFamilies with non-string comparators"

1. Could you explain what is wrong with the table creation to elicit this response during Solr index creation?
2. Also, on a side note: when I executed the wiki-solr demo and launched the Solr admin, I am unable to see the full query interface, and wiki-solr doesn't appear as an option in the left menu (below thread dump).

matt.lieber on "DSE 3.0 with Solr giving 404 error"


I had DataStax DSE 2.2 with Solr, in which I ran the Wikipedia demo, and everything worked fine. I then upgraded to DSE 3.0 and rebuilt the Solr index. Now every time I hit the URL:

http://myserverURL:8983/solr/wiki.solr/admin/
I get a 404 'The requested resource (/solr/wiki.solr/admin/) is not available.' from Tomcat.

I looked in /var/log/cassandra/system.log and there is no error. I found this resource: http://www.datastax.com/docs/datastax_enterprise3.0/upgrade/solr_upgrade and ran the scripts in there; everything again runs fine for me, but it doesn't solve my error. Any idea? Have the resource file locations for Tomcat moved in DSE 3.0?

Cheers, Matt

datastaxnewbie on "Cassandra to Solr Mapping (Fat rows)"


Hi,

I am new to DSE/Cassandra/Solr and hope someone can help me with this problem. How do I create a Solr schema that maps to a CF with a small number of rows (an hourly timestamp as the row key) and a variable, potentially large number of columns (a timestamp UUID as the column name/key)? The purpose of this CF is to store a large series of events, stuffing all events within each hour into one row. I assume I need to know the column names ahead of time in order to create a matching Solr schema? If so, how can I create a Solr schema for a CF whose column names don't exist until an event is inserted? The wiki demo maps the Solr schema to the columns _docBoost, body, and date, but those are known at design time.
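For reference, here is a rough CQL 3 sketch of the row layout I am describing (the names are placeholders; the real CF is a plain wide row with dynamic columns):

CREATE TABLE events_by_hour (
  hour timestamp,     -- row key: the hour bucket
  event_id timeuuid,  -- column name: the per-event timestamp UUID
  payload text,       -- column value
  PRIMARY KEY (hour, event_id)
);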

Thanks in advance,

Paul

bigd on "Warnings when running hadoop commands on MacOS X"


I am new to DataStax - I just downloaded it and started playing with the server.

My environment:
Mac OS X (10.8.2)

I am able to run the examples using Hadoop, Cassandra, and Pig. However, I get this warning every time I run any "dse hadoop xxx" command. The command does run OK, but I wanted to see what this warning is all about:

13/05/04 21:24:54 WARN util.HadoopProcessCheck: Failed to run native child wrapper: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.HadoopProcess.findChildWrapper()Ljava/lang/String;
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.HadoopProcess.findChildWrapper()Ljava/lang/String;
at org.apache.hadoop.util.HadoopProcess.findChildWrapper(Native Method)
at org.apache.hadoop.util.HadoopProcessCheck.findChildWrapper(HadoopProcessCheck.java:35)
at org.apache.hadoop.util.HadoopProcessCheck.tryRunChildWrapper(HadoopProcessCheck.java:49)
at org.apache.hadoop.util.HadoopProcessCheck.isHadoopProcessUsable(HadoopProcessCheck.java:26)
at org.apache.hadoop.util.Shell.startProcess(Shell.java:188)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:225)
at org.apache.hadoop.util.Shell.run(Shell.java:182)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:401)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:487)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:470)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:68)
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:45)
at org.apache.hadoop.security.Groups.getGroups(Groups.java:79)
at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1026)
at com.datastax.bdp.hadoop.cfs.CassandraFileSystemThriftStore.checkPermissions(CassandraFileSystemThriftStore.java:1394)
at com.datastax.bdp.hadoop.cfs.CassandraFileSystemThriftStore.checkPermissions(CassandraFileSystemThriftStore.java:1422)
at com.datastax.bdp.hadoop.cfs.CassandraFileSystem.listStatus(CassandraFileSystem.java:210)
at org.apache.hadoop.fs.FsShell.shellListStatus(FsShell.java:1173)
at org.apache.hadoop.fs.FsShell.ls(FsShell.java:593)
at org.apache.hadoop.fs.FsShell.ls(FsShell.java:582)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:1791)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)

jbdkz on "Opscenter Community Error"


Running OpsCenter v1.4.1 on RHEL 6.4. I modified the /etc/opscenter/opscenterd.conf file to include the interface of the web server and the IP address of the seed Cassandra node.

I have also manually installed the OpsCenter agent on the seed Cassandra node.

When I open the OpsCenter web page in Firefox 20.0.1 on a Windows 2003 server, I get the following message:

Error Loading OpsCenter Error: a[0] is undefined. This is my first foray into the Cassandra/OpsCenter realm, so it could be something silly.

matt.lieber on "Using Solr indexing and Cassandra"


hi,

If I follow the DataStax tutorial about Solr, it says to:
- Create a column family in C*; insert data into it
- Then create a schema in Solr that references this CF, with parameters for each field as:
indexed="true" stored="true"

So my question is: in that case, isn't the data stored *twice*, once in C* (through insertions) and once in Solr (via stored="true")? If so, isn't this inefficient? Or am I missing something?

thanks,
Matt

IanRogers on "cqlsh "copy to" fails when there are more than 22 columns"


See the cqlsh script below. If you uncomment the "c1" line, the "copy to" command exports 0 rows even though "select *" works fine!


ian@ian: cqlsh --version
cqlsh 2.2.0
ian@ian: nodetool -h localhost version
ReleaseVersion: 1.1.9

USE test_keyspace;

DROP TABLE foo;

CREATE TABLE foo (
id varchar PRIMARY KEY,
a0 varchar,
a1 varchar,
a2 varchar,
a3 varchar,
a4 varchar,
a5 varchar,
a6 varchar,
a7 varchar,
a8 varchar,
a9 varchar,
b0 varchar,
b1 varchar,
b2 varchar,
b3 varchar,
b4 varchar,
b5 varchar,
b6 varchar,
b7 varchar,
b8 varchar,
b9 varchar,
c0 varchar,
-- c1 varchar, -- uncomment this line and the "copy" will fail even though the "select" works fine!
);

insert into foo (id, a1) values ('foo', 'grum');

select * from foo;

copy foo to '/tmp/foo.csv';


upant on "sstableloader error"


Hi,

I am running DSE 3.0 on CentOS 6.x. I followed the installation instructions at http://www.datastax.com/docs/datastax_enterprise3.0/install/install_rpm_pkg. I can create column families, insert data, and retrieve data using the CLI. However, when I tried to run sstableloader (http://www.datastax.com/dev/blog/bulk-loading), I got the following error:

Error instantiating snitch class 'com.datastax.bdp.snitch.DseDelegateSnitch'.
Fatal configuration error; unable to start server.

I tried the same code with Apache Cassandra (not the DataStax distribution) and was able to run the loader without any issues or additional configuration.

Has anybody tried sstableloader in DSE?

Thanks,
Uddhab

rbalcon on "Managing AWS AMI Instances"


All,

We are trying to set up a set of DataStax test servers running DSE using the AMI. Is there a way to auto-scale the instances so that we can start and stop all the servers whenever we want? What we want to achieve is the ability to manage our costs by stopping the cluster when we are not using it. Is there another way to achieve the same result?

Will this cause issues? Has anyone else tried to create such a setup that they can share?

Thanks in advance,

Richard

jbdkz on "iptables rules for DataStax Enterprise"


I am running three DataStax Enterprise Cassandra nodes on RHEL 6.4 in a VMware Workstation test environment. On each node I have the iptables service stopped. I want to have iptables running on each node but am not sure how to properly configure the iptables rules for this. I am aware of the link on the DataStax site indicating the ports that need to be opened.

http://www.datastax.com/docs/1.2/security/firewall_ref#firewall-ref

Is there any guide on how to properly configure iptables for use with DataStax Enterprise?

Geetanjali on "Use DSE 3.0.1 with any perl client"


Hi

I want to use the Cassandra [1.1.9.3] that comes bundled with DSE 3.0.1 with a Perl client. I tried perlcassa, but perlcassa is not compatible with Cassandra 1.1.9.3.

Thanks,
Geetanjali

amey on "EC2 AMI 2.4 Issue"


I am creating a 3-node DataStax Community Edition cluster using the guidelines given at http://www.datastax.com/docs/1.2/install/install_ami, using EC2 c1.xlarge machines.
I am providing the following parameters:
--clustername DevCassandra--totalnodes 3 --version community

The cluster creation completes, but the issue seems to be that it creates 3 different clusters with one node each. Also, when I run nodetool status on each cluster, it shows only that same single node in the output.
OpsCenter is also installed on ALL the nodes, which according to the instructions should have been installed on only one node.
I have repeated this process three times; any comments/suggestions on what I might be doing wrong?

Thanks,
Amey
