* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
```
When modifying files that already have a license header, please update the copyright year to the year in which you make your edits. E.g. change ``Copyright (c) 2010 Yahoo! Inc., 2012 - 2016 YCSB contributors.`` to ``Copyright (c) 2010 Yahoo! Inc., 2012 - 2017 YCSB contributors.`` If the file only has ``Copyright (c) 2010 Yahoo! Inc.``, append the current year as in ``Copyright (c) 2010 Yahoo! Inc., 2017 YCSB contributors.``.
**WARNING**: It should go without saying, but don't copy and paste code from outside authors or sources. If you are a database author and want to copy some example code, it must be APL2 compatible.
Client bindings to non-APL databases are perfectly acceptable, as data stores are meant to be used from all kinds of projects. Just make sure not to copy any code or commit libraries or binaries into the YCSB code base. Link to them in the Maven pom file.
## Issues and Support
To track bugs, feature requests and releases we use GitHub's integrated [Issues](https://github.com/brianfrankcooper/YCSB/issues). If you find a bug or problem, open an issue with a descriptive title and as many details as you can give us in the body (stack traces, log files, etc). Then if you can create a fix, follow the PR guidelines below.
**Note**: Before embarking on a code change or a new DB binding, search through the existing issues and pull requests to see if anyone is already working on it. Reach out to them if so.
For general support, please use the mailing list hosted (of course) with Yahoo groups at [http://groups.yahoo.com/group/ycsb-users](http://groups.yahoo.com/group/ycsb-users).
## Code Style
A Java coding style guide is enforced via the Maven CheckStyle plugin. We try not to be too draconian with enforcement but the biggies include:
* Whitespaces instead of tabs.
* Proper Javadocs for methods and classes.
* Camel case member names.
* Upper camel case classes and method names.
* Line length.
CheckStyle runs for pull requests and whenever you create a package locally, so if you only compile and push a commit, you may be surprised when the build fails with a style issue. Execute ``mvn checkstyle:checkstyle`` before you open a PR and you should avoid any surprises.
## Platforms
Since most databases aim to support multiple platforms, YCSB aims to run on as many as possible as well. Besides **Linux** and **macOS**, YCSB must compile and run on **Windows**. While not all DBs will run under every platform, the YCSB tool itself must be able to execute on all of these systems and hopefully be able to communicate with remote data stores.
Additionally, YCSB is targeting Java 7 (1.7.0) as its build version as some users are glacially slow moving to Java 8. So please avoid those Lambdas and Streams for now.
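As a rough illustration of the Java 7 constraint, the sketch below filters a list with a plain loop where Java 8 code might reach for a Stream and a lambda. The class and method names (`Java7StyleExample`, `keysWithPrefix`) are made up for this example and are not part of YCSB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper showing Java 7 compatible collection handling. */
final class Java7StyleExample {
  private Java7StyleExample() {
  }

  /**
   * Returns the keys that start with the given prefix, in input order.
   * A Java 8 version might use keys.stream().filter(...), which does not
   * compile at source level 1.7.
   */
  static List<String> keysWithPrefix(List<String> keys, String prefix) {
    List<String> matches = new ArrayList<>(); // the diamond operator is fine in Java 7
    for (String key : keys) {
      if (key.startsWith(prefix)) {
        matches.add(key);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("user1", "item7", "user2");
    System.out.println(keysWithPrefix(keys, "user")); // prints [user1, user2]
  }
}
```

The example also follows the style points listed earlier: spaces for indentation, Javadoc on the class and method, and camel case names.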
## Pull Requests
You've written some amazing code and are excited to share it with the community! It's time to open a PR! Here's what you should do.
* Check out YCSB's ``master`` branch in your own fork and create a new branch based off of it with a name that is reflective of your work. E.g. ``i123`` for fixing an issue or ``db_xyz`` when working on a binding.
* Add your changes to the branch.
* Commit the code and start the commit message with the component you are working on in square brackets. E.g. ``[core] Add another format for exporting histograms.`` or ``[hbase12] Fix interrupted exception bug.``.
* Push to your fork and click the ``Create Pull Request`` button.
* Wait for the build to complete in the CI pipeline. If it fails with a red X, click through to the logs for details, fix any issues, and commit your changes.
* If you have made changes, please flatten the commits so that the commit logs are nice and clean. Just run a ``git rebase -i ``.
After you have opened your PR, a YCSB maintainer will review it and offer constructive feedback via the GitHub review feature. If no one has responded to your PR, please bump the thread by adding comments.
**NOTE**: For maintainers, please get another maintainer to sign off on your changes before merging a PR. And if you're writing code, please do create a PR from your fork, don't just push code directly to the master branch.
## Core, Bindings and Workloads
The main components of the code base include the core library and benchmarking utility, various database client bindings and workload classes and definitions.
### Core
When working on the core classes, keep in mind the following:
* Do not change the core behavior or operation of the main benchmarking classes (Particularly the Client and Workload classes). YCSB is used all over the place because it's a consistent standard that allows different users to compare results with the same workloads. If you find a way to drastically improve throughput, that's great! But please check with the rest of the maintainers to see if we can add the tweaks without invalidating years of benchmarks.
* Do not remove or modify measurements. Users may have tooling to parse the outputs so if you take something out, they'll be a wee bit unhappy. Extending or adding measurements is fine (so if you do have tooling, expect additions.)
* Do not modify existing generators. Again we don't want to invalidate years of benchmarks. Instead, create a new generator or option that can be enabled explicitly (not implicitly!) for users to try out.
* Utility classes and methods are welcome. But if they're only ever used by a specific database binding, co-locate the code with that binding.
* Don't change the DB interface if at all possible. Implementations can squeeze all kinds of workloads through the existing interface and while it may be easy to change the bindings included with the source code, some users may have private clients they can't share with the community.
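The "enabled explicitly" rule for new generators could look roughly like the sketch below. The property name `examplegenerator` and both of its values are invented for illustration and are not real YCSB properties; the only point is that the existing behavior stays the default and the new behavior requires an explicit setting.

```java
import java.util.Properties;

/** Hypothetical sketch of an opt-in setting that keeps existing behavior as the default. */
final class OptInGeneratorExample {
  // The property name and its values are illustrative only.
  static final String GENERATOR_PROPERTY = "examplegenerator";
  static final String LEGACY = "legacy";
  static final String EXPERIMENTAL = "experimental";

  private OptInGeneratorExample() {
  }

  /** Returns the generator to use; anything but an explicit opt-in yields the legacy one. */
  static String chooseGenerator(Properties props) {
    String requested = props.getProperty(GENERATOR_PROPERTY, LEGACY);
    return EXPERIMENTAL.equals(requested) ? EXPERIMENTAL : LEGACY;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(chooseGenerator(props)); // no setting: legacy behavior
    props.setProperty(GENERATOR_PROPERTY, EXPERIMENTAL);
    System.out.println(chooseGenerator(props)); // explicit opt-in
  }
}
```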
### Bindings and Clients
When a new database is released a *binding* can be created that implements a client communicating with the given data store that will execute YCSB workloads. Details about writing a DB binding can be found on our [GitHub Wiki page](https://github.com/brianfrankcooper/YCSB/wiki/Adding-a-Database). Some development guidelines to follow include:
* Create a new Maven module for your binding. Follow the existing bindings as examples.
* The module *must* include a README.md file with details such as:
  * Database setup with links to documentation so that the YCSB benchmarks will execute properly.
  * Example command line executions (workload selection, etc).
  * Required and optional properties (e.g. connection strings, behavior settings, etc) along with the default values.
  * Versions of the database the binding supports.
* Javadoc the binding and all of the methods. Tell us what it does and how it works.
Because YCSB is a utility to compare multiple data stores, we need each binding to behave similarly by default. That means each data store should enforce the strictest consistency guarantees available and avoid client side buffering or optimizations. This allows users to evaluate different DBs with a common baseline and tough standards.
However, you *should* include parameters to tune and improve performance as much as possible to reach those flashy marketing numbers. Just be honest and document what the settings do and what trade-offs are made (e.g. client side buffering reduces I/O but a crash can lead to data loss).
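A minimal sketch of "strict by default, tunable on request" follows. Everything in it is hypothetical: `exampledb.clientbuffering` is not a real property, and the `sent` list merely stands in for a remote data store. With the default setting every write is sent immediately; buffering only happens when the user asks for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical write path: unbuffered unless the user explicitly enables buffering. */
final class BufferedWriterExample {
  // Illustrative property name; documented trade-off: fewer round trips, but
  // buffered records are lost if the client crashes before flush().
  static final String BUFFER_PROPERTY = "exampledb.clientbuffering";

  private final boolean buffering;
  private final List<String> pending = new ArrayList<>();
  private final List<String> sent = new ArrayList<>(); // stand-in for the data store

  BufferedWriterExample(Properties props) {
    // Defaults to false so that results are comparable across bindings.
    buffering = Boolean.parseBoolean(props.getProperty(BUFFER_PROPERTY, "false"));
  }

  /** Queues a record and, when buffering is off, sends it right away. */
  void write(String record) {
    pending.add(record);
    if (!buffering) {
      flush();
    }
  }

  /** Sends all pending records to the (simulated) store. */
  void flush() {
    sent.addAll(pending);
    pending.clear();
  }

  int sentCount() {
    return sent.size();
  }

  int pendingCount() {
    return pending.size();
  }
}
```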
### Workloads
YCSB began comparing various key/value data stores with simple CRUD operations. However as DBs have become more specialized we've added more workloads for various tasks and would love to have more in the future. Keep the following in mind:
* Make sure more than one publicly available database can handle your workload. It's no fun if only one player is in the game.
* Use the existing DB interface to pass your data around. If you really need another API, discuss with the maintainers to see if there isn't a workaround.
* Provide real-world use cases for the workload, not just theoretical idealizations.
================================================
FILE: LICENSE.txt
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: NOTICE.txt
================================================
=========================================================================
NOTICE file for use with, and corresponding to Section 4 of,
the Apache License, Version 2.0,
in this case for the YCSB project.
=========================================================================
This product includes software developed by
Yahoo! Inc. (www.yahoo.com)
Copyright (c) 2010 Yahoo! Inc. All rights reserved.
This product includes software developed by
Google Inc. (www.google.com)
Copyright (c) 2015 Google Inc. All rights reserved.
================================================
FILE: README.md
================================================
# ByteIterator
To read data from the database, I use the ByteIterator data interface. An example and its benefits follow.
Code example:
```Java
@Override
public Status read(String table, String key, Set<String> fields, Map<String, ByteIterator> result) {
  try {
    // Fetch the raw bytes stored under the key; no character decoding happens here.
    byte[] value = db.get(key.getBytes());
    Map<String, ByteIterator> deserialized = deserialize(value);
    result.putAll(deserialized);
  } catch (RocksDBException e) {
    System.out.format("[ERROR] caught the unexpected exception -- %s\n", e);
    return Status.ERROR;
  }
  return Status.OK;
}
```
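Why do I use ByteIterator here?
a. Performance: the main concerns are the cost of strings and the copying and transcoding they involve. The stream may also be an image (in blob form).
b. A byte is just a byte, so it hides encoding details such as UTF-8 and GBK. Text read from disk is binary to begin with and must be converted to characters through an encoding. ByteIterator hides the problem of different servers using different encodings.
What are the advantages and disadvantages of storing data as bytes?
Omitted.
What are the advantages and disadvantages of using an Iterator? Can it hide details?
Omitted.
What are the advantages and disadvantages of using ByteIterator?
Omitted.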
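The encoding point can be seen in a small, self-contained sketch. It is not YCSB code: `RawBytesExample` and `storeAndDecode` are made-up names, and ISO-8859-1 is used in place of GBK only because the JDK guarantees it is available. The stored bytes never change, but the text a reader gets back depends on which charset it decodes with.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: raw bytes are stable, decoded text depends on the charset. */
final class RawBytesExample {
  private RawBytesExample() {
  }

  /** Encodes text as UTF-8 and decodes the bytes with the given charset. */
  static String storeAndDecode(String text, Charset readerCharset) {
    byte[] stored = text.getBytes(StandardCharsets.UTF_8); // what lands on disk
    return new String(stored, readerCharset);              // what a reader sees
  }

  public static void main(String[] args) {
    String original = "caf\u00e9"; // "cafe" with an acute accent on the e

    // The same text has different byte lengths under different encodings.
    System.out.println(original.getBytes(StandardCharsets.UTF_8).length);      // 5
    System.out.println(original.getBytes(StandardCharsets.ISO_8859_1).length); // 4

    // Decoding with the writer's charset recovers the text; another charset does not.
    System.out.println(original.equals(storeAndDecode(original, StandardCharsets.UTF_8)));      // true
    System.out.println(original.equals(storeAndDecode(original, StandardCharsets.ISO_8859_1))); // false
  }
}
```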
LevelDB and RocksDB modules of YCSB
====================================
[Build Status](https://travis-ci.org/brianfrankcooper/YCSB)
Links
-----
http://wiki.github.com/brianfrankcooper/YCSB/
https://labs.yahoo.com/news/yahoo-cloud-serving-benchmark/
ycsb-users@yahoogroups.com
Getting Started
---------------
1. Download the [latest release of YCSB](https://github.com/brianfrankcooper/YCSB/releases/latest):
```sh
curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.12.0/ycsb-0.12.0.tar.gz
tar xfvz ycsb-0.12.0.tar.gz
cd ycsb-0.12.0
```
2. Set up a database to benchmark. There is a README file under each binding
directory.
3. Run YCSB command.
On Linux:
```sh
bin/ycsb.sh load basic -P workloads/workloada
bin/ycsb.sh run basic -P workloads/workloada
```
On Windows:
```bat
bin\ycsb.bat load basic -P workloads\workloada
bin\ycsb.bat run basic -P workloads\workloada
```
Running the `ycsb` command without any argument will print the usage.
See https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload
for detailed documentation on how to run a workload.
See https://github.com/brianfrankcooper/YCSB/wiki/Core-Properties for
the list of available workload properties.
Building from source
--------------------
YCSB requires the use of Maven 3; if you use Maven 2, you may see [errors
such as these](https://github.com/brianfrankcooper/YCSB/issues/406).
To build the full distribution, with all database bindings:
    mvn clean package
To build a single database binding:
    mvn -pl com.yahoo.ycsb:mongodb-binding -am clean package
================================================
FILE: Todo.md
================================================
@Override
public Status read(String table, String key, Set<String> fields, Map<String, ByteIterator> result) {
  try {
    byte[] value = db.get(key.getBytes());
    Map<String, ByteIterator> deserialized = deserialize(value);
    result.putAll(deserialized);
  } catch (RocksDBException e) {
    System.out.format("[ERROR] caught the unexpected exception -- %s\n", e);
    return Status.ERROR;
  }
  return Status.OK;
}
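/**
1. Why do I use ByteIterator here?
a. Performance: the main concerns are the cost of strings and the copying and transcoding they involve. The stream may also be an image (in blob form).
b. A byte is just a byte, so it hides encoding details such as UTF-8 and GBK. Text read from disk is binary to begin with and must be converted to characters through an encoding.
ByteIterator hides the problem of different servers using different encodings.
**/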
================================================
FILE: accumulo1.6/README.md
================================================
## Quick Start
This section describes how to run YCSB on [Accumulo](https://accumulo.apache.org/).
### 1. Start Accumulo
See the [Accumulo Documentation](https://accumulo.apache.org/1.6/accumulo_user_manual.html#_installation)
for details on installing and running Accumulo.
Before running the YCSB test you must create the Accumulo table. Again see the
[Accumulo Documentation](https://accumulo.apache.org/1.6/accumulo_user_manual.html#_basic_administration)
for details. The default table name is `usertable`.
### 2. Set Up YCSB
Git clone YCSB and compile:
    git clone http://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn -pl com.yahoo.ycsb:accumulo1.6-binding -am clean package
### 3. Create the Accumulo table
By default, YCSB uses a table with the name "usertable". Users must create this table before loading
data into Accumulo. For maximum Accumulo performance, the Accumulo table must be pre-split. A simple
Ruby script, based on the HBase README, can generate adequate split points. Tens of tablets per
TabletServer is a good starting point. Unless otherwise specified, the following commands should run
on any version of Accumulo.
    $ echo 'num_splits = 20; puts (1..num_splits).map {|i| "user#{1000+i*(9999-1000)/num_splits}"}' | ruby > /tmp/splits.txt
    $ accumulo shell -u <user> -p <password> -e "createtable usertable"
    $ accumulo shell -u <user> -p <password> -e "addsplits -t usertable -sf /tmp/splits.txt"
    $ accumulo shell -u <user> -p <password> -e "config -t usertable -s table.cache.block.enable=true"
Additionally, there are some other configuration properties which can increase performance. These
can be set on the Accumulo table via the shell after it is created. Setting the table durability
to `flush` relaxes the constraints on data durability during hard power-outages (avoids calls
to fsync). Accumulo defaults table compression to `gzip` which is not particularly fast; `snappy`
is a faster and similarly efficient option. The mutation queue property controls how many writes
Accumulo will buffer in memory before performing a flush; this property should be set relative
to the amount of JVM heap the TabletServers are given.
Please note that the `table.durability` and `tserver.total.mutation.queue.max` properties exist only
in Accumulo 1.7 and later. There are no concise replacements for these properties in earlier versions.
    accumulo> config -s table.durability=flush
    accumulo> config -s tserver.total.mutation.queue.max=256M
    accumulo> config -t usertable -s table.file.compress.type=snappy
On repeated data loads, the following commands may be helpful to reset the state of the table quickly.
    accumulo> createtable tmp --copy-splits usertable --copy-config usertable
    accumulo> deletetable --force usertable
    accumulo> renametable tmp usertable
    accumulo> compact --wait -t accumulo.metadata
### 4. Load Data and Run Tests
Load the data:
    ./bin/ycsb load accumulo1.6 -s -P workloads/workloada \
         -p accumulo.zooKeepers=localhost \
         -p accumulo.columnFamily=ycsb \
         -p accumulo.instanceName=ycsb \
         -p accumulo.username=user \
         -p accumulo.password=supersecret \
         > outputLoad.txt
Run the workload test:
    ./bin/ycsb run accumulo1.6 -s -P workloads/workloada \
         -p accumulo.zooKeepers=localhost \
         -p accumulo.columnFamily=ycsb \
         -p accumulo.instanceName=ycsb \
         -p accumulo.username=user \
         -p accumulo.password=supersecret \
         > outputRun.txt
## Accumulo Configuration Parameters
- `accumulo.zooKeepers`
  - The Accumulo cluster's [zookeeper servers](https://accumulo.apache.org/1.6/accumulo_user_manual.html#_connecting).
  - Should contain a comma separated list of hostname or hostname:port values.
  - No default value.
- `accumulo.columnFamily`
  - The name of the column family to use to store the data within the table.
  - No default value.
- `accumulo.instanceName`
  - Name of the Accumulo [instance](https://accumulo.apache.org/1.6/accumulo_user_manual.html#_connecting).
  - No default value.
- `accumulo.username`
  - The username to use when connecting to Accumulo.
  - No default value.
- `accumulo.password`
  - The password for the user connecting to Accumulo.
  - No default value.
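
Since none of these properties has a default, it can help to fail fast with a clear message when one is missing. The helper below is only a sketch of that idea: `RequiredPropertyExample` and `require` are hypothetical names and are not part of YCSB or Accumulo; only the property names come from the list above.

```java
import java.util.Properties;

/** Hypothetical helper that rejects missing or blank required properties. */
final class RequiredPropertyExample {
  private RequiredPropertyExample() {
  }

  /** Returns the value of a required property or throws with the property name. */
  static String require(Properties props, String name) {
    String value = props.getProperty(name);
    if (value == null || value.trim().isEmpty()) {
      throw new IllegalArgumentException("Missing required property: " + name);
    }
    return value;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("accumulo.instanceName", "ycsb");

    System.out.println(require(props, "accumulo.instanceName")); // prints ycsb
    try {
      require(props, "accumulo.zooKeepers"); // not set above
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // prints Missing required property: accumulo.zooKeepers
    }
  }
}
```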
================================================
FILE: accumulo1.6/pom.xml
================================================
4.0.0com.yahoo.ycsbbinding-parent0.14.0-SNAPSHOT../binding-parentaccumulo1.6-bindingAccumulo 1.6 DB Binding2.2.0trueorg.apache.accumuloaccumulo-core${accumulo.1.6.version}org.apache.hadoophadoop-common${hadoop.version}jdk.toolsjdk.toolscom.yahoo.ycsbcore${project.version}providedjunitjunit4.12testorg.apache.accumuloaccumulo-minicluster${accumulo.1.6.version}testorg.slf4jslf4j-api1.7.13../workloadsworkloadssrc/test/resources
================================================
FILE: accumulo1.6/src/main/conf/accumulo.properties
================================================
# Copyright 2014 Cloudera, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Sample Accumulo configuration properties
#
# You may either set properties here or via the command line.
#
# This will influence the keys we write
accumulo.columnFamily=YCSB
# This should be set based on your Accumulo cluster
#accumulo.instanceName=ExampleInstance
# Comma separated list of host:port tuples for the ZooKeeper quorum used
# by your Accumulo cluster
#accumulo.zooKeepers=zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181
# This user will need permissions on the table YCSB works against
#accumulo.username=ycsb
#accumulo.password=protectyaneck
# Controls how long our client writer will wait to buffer more data
# measured in milliseconds
accumulo.batchWriterMaxLatency=30000
# Controls how much data our client will attempt to buffer before sending
# measured in bytes
accumulo.batchWriterSize=100000
# Controls how many worker threads our client will use to parallelize writes
accumulo.batchWriterThreads=1
================================================
FILE: accumulo1.6/src/main/java/com/yahoo/ycsb/db/accumulo/AccumuloClient.java
================================================
/**
* Copyright (c) 2011 YCSB++ project, 2014-2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static java.nio.charset.StandardCharsets.UTF_8;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.SortedMap;
import java.util.Vector;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.user.WholeRowIterator;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.util.CleanUp;
import org.apache.hadoop.io.Text;
import com.yahoo.ycsb.ByteArrayByteIterator;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
/**
* Accumulo binding for YCSB.
*/
public class AccumuloClient extends DB {
private ZooKeeperInstance inst;
private Connector connector;
private Text colFam = new Text("");
private byte[] colFamBytes = new byte[0];
private final ConcurrentHashMap<String, BatchWriter> writers = new ConcurrentHashMap<>();
static {
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
CleanUp.shutdownNow();
}
});
}
@Override
public void init() throws DBException {
colFam = new Text(getProperties().getProperty("accumulo.columnFamily"));
colFamBytes = colFam.toString().getBytes(UTF_8);
inst = new ZooKeeperInstance(
getProperties().getProperty("accumulo.instanceName"),
getProperties().getProperty("accumulo.zooKeepers"));
try {
String principal = getProperties().getProperty("accumulo.username");
AuthenticationToken token =
new PasswordToken(getProperties().getProperty("accumulo.password"));
connector = inst.getConnector(principal, token);
} catch (AccumuloException | AccumuloSecurityException e) {
throw new DBException(e);
}
if (!(getProperties().getProperty("accumulo.pcFlag", "none").equals("none"))) {
System.err.println("Sorry, the ZK based producer/consumer implementation has been removed. " +
"Please see YCSB issue #416 for work on adding a general solution to coordinated work.");
}
}
@Override
public void cleanup() throws DBException {
try {
Iterator<BatchWriter> iterator = writers.values().iterator();
while (iterator.hasNext()) {
BatchWriter writer = iterator.next();
writer.close();
iterator.remove();
}
} catch (MutationsRejectedException e) {
throw new DBException(e);
}
}
/**
* Returns the BatchWriter for the given table, creating and caching one if
* none exists yet.
*
* @param table
* The table to open.
* @return the shared BatchWriter for that table
*/
public BatchWriter getWriter(String table) throws TableNotFoundException {
// tl;dr We're paying a cost for the ConcurrentHashMap here to deal with the DB api.
// We know that YCSB is really only ever going to send us data for one table, so using
// a concurrent data structure is overkill (especially in such a hot code path).
// However, the impact seems to be relatively negligible in trivial local tests and it's
// "more correct" WRT to the API.
BatchWriter writer = writers.get(table);
if (null == writer) {
BatchWriter newWriter = createBatchWriter(table);
BatchWriter oldWriter = writers.putIfAbsent(table, newWriter);
// Someone beat us to creating a BatchWriter for this table, use their BatchWriters
if (null != oldWriter) {
try {
// Make sure to clean up our new batchwriter!
newWriter.close();
} catch (MutationsRejectedException e) {
throw new RuntimeException(e);
}
writer = oldWriter;
} else {
writer = newWriter;
}
}
return writer;
}
/**
* Creates a BatchWriter with the expected configuration.
*
* @param table The table to write to
*/
private BatchWriter createBatchWriter(String table) throws TableNotFoundException {
BatchWriterConfig bwc = new BatchWriterConfig();
bwc.setMaxLatency(
Long.parseLong(getProperties()
.getProperty("accumulo.batchWriterMaxLatency", "30000")),
TimeUnit.MILLISECONDS);
bwc.setMaxMemory(Long.parseLong(
getProperties().getProperty("accumulo.batchWriterSize", "100000")));
final String numThreadsValue = getProperties().getProperty("accumulo.batchWriterThreads");
// Try to saturate the client machine.
int numThreads = Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
if (null != numThreadsValue) {
numThreads = Integer.parseInt(numThreadsValue);
}
System.err.println("Using " + numThreads + " threads to write data");
bwc.setMaxWriteThreads(numThreads);
return connector.createBatchWriter(table, bwc);
}
/**
* Gets a scanner from Accumulo over one row.
*
* @param row the row to scan
* @param fields the set of columns to scan
* @return an Accumulo {@link Scanner} bound to the given row and columns
*/
private Scanner getRow(String table, Text row, Set<String> fields) throws TableNotFoundException {
Scanner scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(row));
if (fields != null) {
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
return scanner;
}
@Override
public Status read(String table, String key, Set<String> fields,
Map<String, ByteIterator> result) {
Scanner scanner = null;
try {
scanner = getRow(table, new Text(key), null);
// Pick out the results we care about.
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
entry.getKey().getColumnQualifier(cq);
Value v = entry.getValue();
byte[] buf = v.get();
result.put(cq.toString(),
new ByteArrayByteIterator(buf));
}
} catch (Exception e) {
System.err.println("Error trying to read Accumulo table " + table + " " + key);
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status scan(String table, String startkey, int recordcount,
Set<String> fields, Vector<HashMap<String, ByteIterator>> result) {
// Just make the end 'infinity' and only read as much as we need.
Scanner scanner = null;
try {
scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(new Text(startkey), null));
// Have Accumulo send us complete rows, serialized in a single Key-Value pair
IteratorSetting cfg = new IteratorSetting(100, WholeRowIterator.class);
scanner.addScanIterator(cfg);
// If no fields are provided, we assume one column/row.
if (fields != null) {
// And add each of them as fields we want.
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
int count = 0;
for (Entry<Key, Value> entry : scanner) {
// Deserialize the row
SortedMap<Key, Value> row = WholeRowIterator.decodeRow(entry.getKey(), entry.getValue());
HashMap<String, ByteIterator> rowData;
if (null != fields) {
rowData = new HashMap<>(fields.size());
} else {
rowData = new HashMap<>();
}
result.add(rowData);
// Parse the data in the row, avoid unnecessary Text object creation
final Text cq = new Text();
for (Entry<Key, Value> rowEntry : row.entrySet()) {
rowEntry.getKey().getColumnQualifier(cq);
rowData.put(cq.toString(), new ByteArrayByteIterator(rowEntry.getValue().get()));
}
if (++count >= recordcount) { // Stop once recordcount rows have been read.
break;
}
}
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
} catch (IOException e) {
System.err.println("Error deserializing data from Accumulo.");
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status update(String table, String key,
Map<String, ByteIterator> values) {
BatchWriter bw = null;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error opening batch writer to Accumulo table " + table);
e.printStackTrace();
return Status.ERROR;
}
Mutation mutInsert = new Mutation(key.getBytes(UTF_8));
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
mutInsert.put(colFamBytes, entry.getKey().getBytes(UTF_8), entry.getValue().toArray());
}
try {
bw.addMutation(mutInsert);
} catch (MutationsRejectedException e) {
System.err.println("Error performing update.");
e.printStackTrace();
return Status.ERROR;
}
return Status.BATCHED_OK;
}
@Override
public Status insert(String t, String key,
Map<String, ByteIterator> values) {
return update(t, key, values);
}
@Override
public Status delete(String table, String key) {
BatchWriter bw;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
}
try {
deleteRow(table, new Text(key), bw);
} catch (TableNotFoundException | MutationsRejectedException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
} catch (RuntimeException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
}
return Status.OK;
}
// These functions are adapted from RowOperations.java:
private void deleteRow(String table, Text row, BatchWriter bw) throws MutationsRejectedException,
TableNotFoundException {
// TODO Use a batchDeleter instead
deleteRow(getRow(table, row, null), bw);
}
/**
* Deletes a row, given a Scanner of JUST that row.
*/
private void deleteRow(Scanner scanner, BatchWriter bw) throws MutationsRejectedException {
Mutation deleter = null;
// iterate through the keys
final Text row = new Text();
final Text cf = new Text();
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
// create a mutation for the row
if (deleter == null) {
entry.getKey().getRow(row);
deleter = new Mutation(row);
}
entry.getKey().getColumnFamily(cf);
entry.getKey().getColumnQualifier(cq);
// the remove function adds the key with the delete flag set to true
deleter.putDelete(cf, cq);
}
bw.addMutation(deleter);
}
}
================================================
FILE: accumulo1.6/src/main/java/com/yahoo/ycsb/db/accumulo/package-info.java
================================================
/**
* Copyright (c) 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* YCSB binding for Apache Accumulo.
*/
package com.yahoo.ycsb.db.accumulo;
================================================
FILE: accumulo1.6/src/test/java/com/yahoo/ycsb/db/accumulo/AccumuloTest.java
================================================
/*
* Copyright (c) 2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assume.assumeTrue;
import java.util.Map.Entry;
import java.util.Properties;
import com.yahoo.ycsb.Workload;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.measurements.Measurements;
import com.yahoo.ycsb.workloads.CoreWorkload;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.security.TablePermission;
import org.apache.accumulo.minicluster.MiniAccumuloCluster;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import org.junit.rules.TestName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* Use an Accumulo MiniCluster to test out basic workload operations with
* the Accumulo binding.
*/
public class AccumuloTest {
private static final Logger LOG = LoggerFactory.getLogger(AccumuloTest.class);
private static final int INSERT_COUNT = 2000;
private static final int TRANSACTION_COUNT = 2000;
@ClassRule
public static TemporaryFolder workingDir = new TemporaryFolder();
@Rule
public TestName test = new TestName();
private static MiniAccumuloCluster cluster;
private static Properties properties;
private Workload workload;
private DB client;
private Properties workloadProps;
private static boolean isWindows() {
final String os = System.getProperty("os.name");
return os.startsWith("Windows");
}
@BeforeClass
public static void setup() throws Exception {
// Minicluster setup fails on Windows with an UnsatisfiedLinkError.
// Skip if windows.
assumeTrue(!isWindows());
cluster = new MiniAccumuloCluster(workingDir.newFolder("accumulo").getAbsoluteFile(), "protectyaneck");
LOG.debug("starting minicluster");
cluster.start();
LOG.debug("creating connection for admin operations.");
// set up the table and user
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().create(CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
admin.securityOperations().createLocalUser("ycsb", new PasswordToken("protectyaneck"));
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.READ);
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.WRITE);
// set properties the binding will read
properties = new Properties();
properties.setProperty("accumulo.zooKeepers", cluster.getZooKeepers());
properties.setProperty("accumulo.instanceName", cluster.getInstanceName());
properties.setProperty("accumulo.columnFamily", "family");
properties.setProperty("accumulo.username", "ycsb");
properties.setProperty("accumulo.password", "protectyaneck");
// cut down the batch writer timeout so that writes will push through.
properties.setProperty("accumulo.batchWriterMaxLatency", "4");
// set these explicitly to the defaults at the time we're compiled, since they'll be inlined in our class.
properties.setProperty(CoreWorkload.TABLENAME_PROPERTY, CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.FIELD_COUNT_PROPERTY, CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.INSERT_ORDER_PROPERTY, "ordered");
}
@AfterClass
public static void clusterCleanup() throws Exception {
if (cluster != null) {
LOG.debug("shutting down minicluster");
cluster.stop();
cluster = null;
}
}
@Before
public void client() throws Exception {
LOG.debug("Loading workload properties for {}", test.getMethodName());
workloadProps = new Properties();
workloadProps.load(getClass().getResourceAsStream("/workloads/" + test.getMethodName()));
for (String prop : properties.stringPropertyNames()) {
workloadProps.setProperty(prop, properties.getProperty(prop));
}
// TODO we need a better test rig for 'run this ycsb workload'
LOG.debug("initializing measurements and workload");
Measurements.setProperties(workloadProps);
workload = new CoreWorkload();
workload.init(workloadProps);
LOG.debug("initializing client");
client = new AccumuloClient();
client.setProperties(workloadProps);
client.init();
}
@After
public void cleanup() throws Exception {
if (client != null) {
LOG.debug("cleaning up client");
client.cleanup();
client = null;
}
if (workload != null) {
LOG.debug("cleaning up workload");
workload.cleanup();
}
}
@After
public void truncateTable() throws Exception {
if (cluster != null) {
LOG.debug("truncating table {}", CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().deleteRows(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, null, null);
}
}
@Test
public void workloada() throws Exception {
runWorkload();
}
@Test
public void workloadb() throws Exception {
runWorkload();
}
@Test
public void workloadc() throws Exception {
runWorkload();
}
@Test
public void workloadd() throws Exception {
runWorkload();
}
@Test
public void workloade() throws Exception {
runWorkload();
}
/**
* Go through a workload cycle.
* <ol>
* <li>initialize thread-specific state
* <li>load the workload dataset
* <li>run workload transactions
* </ol>
*/
private void runWorkload() throws Exception {
final Object state = workload.initThread(workloadProps,0,0);
LOG.debug("load");
for (int i = 0; i < INSERT_COUNT; i++) {
assertTrue("insert failed.", workload.doInsert(client, state));
}
// Ensure we wait long enough for the batch writer to flush
// TODO accumulo client should be flushing per insert by default.
Thread.sleep(2000);
LOG.debug("verify number of cells");
final Scanner scanner = cluster.getConnector("root", "protectyaneck").createScanner(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, Authorizations.EMPTY);
int count = 0;
for (Entry<Key, Value> entry : scanner) {
count++;
}
assertEquals("Didn't get enough total cells.", (Integer.valueOf(CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT) * INSERT_COUNT), count);
LOG.debug("run");
for (int i = 0; i < TRANSACTION_COUNT; i++) {
assertTrue("transaction failed.", workload.doTransaction(client, state));
}
}
}
================================================
FILE: accumulo1.6/src/test/resources/log4j.properties
================================================
#
# Copyright (c) 2015 YCSB contributors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Root logger option
log4j.rootLogger=INFO, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.conversionPattern=%d{yyyy/MM/dd HH:mm:ss} %-5p %c %x - %m%n
# Suppress messages from ZooKeeper
log4j.logger.com.yahoo.ycsb.db.accumulo=INFO
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.org.apache.accumulo=WARN
================================================
FILE: accumulo1.7/README.md
================================================
## Quick Start
This section describes how to run YCSB on [Accumulo](https://accumulo.apache.org/).
### 1. Start Accumulo
See the [Accumulo Documentation](https://accumulo.apache.org/1.7/accumulo_user_manual.html#_installation)
for details on installing and running Accumulo.
Before running the YCSB test you must create the Accumulo table. Again see the
[Accumulo Documentation](https://accumulo.apache.org/1.7/accumulo_user_manual.html#_basic_administration)
for details. The default table name is `usertable`.
### 2. Set Up YCSB
Git clone YCSB and compile:

    git clone http://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn -pl com.yahoo.ycsb:accumulo1.7-binding -am clean package

### 3. Create the Accumulo table
By default, YCSB uses a table with the name "usertable". Users must create this table before loading
data into Accumulo. For maximum Accumulo performance, the Accumulo table must be pre-split. A simple
Ruby script, based on the HBase README, can generate adequate split points. Tens of tablets per
TabletServer is a good starting point. Unless otherwise specified, the following commands should run
on any version of Accumulo.

    $ echo 'num_splits = 20; puts (1..num_splits).map {|i| "user#{1000+i*(9999-1000)/num_splits}"}' | ruby > /tmp/splits.txt
    $ accumulo shell -u <user> -p <password> -e "createtable usertable"
    $ accumulo shell -u <user> -p <password> -e "addsplits -t usertable -sf /tmp/splits.txt"
    $ accumulo shell -u <user> -p <password> -e "config -t usertable -s table.cache.block.enable=true"

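Where Ruby is not available, the same split points can be produced in Java. The following is a minimal sketch, not part of YCSB: the class name `SplitPointSketch` and its `splitPoints` method are hypothetical, and the arithmetic only mirrors the Ruby one-liner above (evenly spaced keys between `user1000` and `user9999`).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of YCSB: mirrors the Ruby one-liner above. */
final class SplitPointSketch {
  private SplitPointSketch() {}

  /** Returns numSplits evenly spaced row keys, ending at user9999. */
  static List<String> splitPoints(int numSplits) {
    if (numSplits < 1) {
      throw new IllegalArgumentException("numSplits must be positive");
    }
    List<String> points = new ArrayList<>(numSplits);
    for (int i = 1; i <= numSplits; i++) {
      // Integer division, matching Ruby's result for positive integers.
      points.add("user" + (1000 + i * (9999 - 1000) / numSplits));
    }
    return points;
  }

  public static void main(String[] args) {
    // Number of splits defaults to 20, as in the Ruby example.
    int numSplits = args.length > 0 ? Integer.parseInt(args[0]) : 20;
    for (String point : splitPoints(numSplits)) {
      System.out.println(point);
    }
  }
}
```

Redirecting the program's standard output to `/tmp/splits.txt` should give a file that can be passed to `addsplits -sf` in the same way.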
Additionally, some other configuration properties can increase performance. These
can be set on the Accumulo table via the shell after it is created. Setting the table durability
to `flush` relaxes the constraints on data durability during hard power outages (it avoids calls
to fsync). Accumulo defaults table compression to `gzip`, which is not particularly fast; `snappy`
is a faster and similarly efficient option. The mutation queue property controls how many writes
Accumulo will buffer in memory before performing a flush; this property should be set relative
to the amount of JVM heap the TabletServers are given.
Please note that the `table.durability` and `tserver.total.mutation.queue.max` properties exist
only in Accumulo 1.7 and later. There are no concise replacements for these properties in earlier versions.

    accumulo> config -s table.durability=flush
    accumulo> config -s tserver.total.mutation.queue.max=256M
    accumulo> config -t usertable -s table.file.compress.type=snappy

On repeated data loads, the following commands may be helpful to re-set the state of the table quickly.

    accumulo> createtable tmp --copy-splits usertable --copy-config usertable
    accumulo> deletetable --force usertable
    accumulo> renametable tmp usertable
    accumulo> compact --wait -t accumulo.metadata

### 4. Load Data and Run Tests
Load the data:

    ./bin/ycsb load accumulo1.7 -s -P workloads/workloada \
         -p accumulo.zooKeepers=localhost \
         -p accumulo.columnFamily=ycsb \
         -p accumulo.instanceName=ycsb \
         -p accumulo.username=user \
         -p accumulo.password=supersecret \
         > outputLoad.txt

Run the workload test:

    ./bin/ycsb run accumulo1.7 -s -P workloads/workloada \
         -p accumulo.zooKeepers=localhost \
         -p accumulo.columnFamily=ycsb \
         -p accumulo.instanceName=ycsb \
         -p accumulo.username=user \
         -p accumulo.password=supersecret \
         > outputRun.txt

## Accumulo Configuration Parameters
- `accumulo.zooKeepers`
  - The Accumulo cluster's [zookeeper servers](https://accumulo.apache.org/1.7/accumulo_user_manual.html#_connecting).
  - Should contain a comma-separated list of hostname or hostname:port values.
  - No default value.
- `accumulo.columnFamily`
  - The name of the column family to use to store the data within the table.
  - No default value.
- `accumulo.instanceName`
  - Name of the Accumulo [instance](https://accumulo.apache.org/1.7/accumulo_user_manual.html#_connecting).
  - No default value.
- `accumulo.username`
  - The username to use when connecting to Accumulo.
  - No default value.
- `accumulo.password`
  - The password for the user connecting to Accumulo.
  - No default value.
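
Because none of the parameters above has a default, it can help to check them before starting a run. The following is a minimal sketch that assumes all five are required. `AccumuloPropsSketch` and `missingKeys` are hypothetical names and are not part of the binding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check; not part of the YCSB Accumulo binding. */
final class AccumuloPropsSketch {
  // Property names as documented in this README. Treating every one of them
  // as required is an assumption based on "No default value".
  static final String[] REQUIRED = {
    "accumulo.zooKeepers",
    "accumulo.columnFamily",
    "accumulo.instanceName",
    "accumulo.username",
    "accumulo.password"
  };

  private AccumuloPropsSketch() {}

  /** Returns the required keys that are absent or blank in props. */
  static List<String> missingKeys(Properties props) {
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED) {
      String value = props.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("accumulo.zooKeepers", "localhost");
    props.setProperty("accumulo.instanceName", "ycsb");
    // Reports the keys that were not set above.
    System.out.println("Missing: " + missingKeys(props));
  }
}
```

A non-empty result means at least one `-p accumulo.*` option is still missing from the command line or the properties file.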
================================================
FILE: accumulo1.7/pom.xml
================================================
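<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>binding-parent</artifactId>
    <version>0.14.0-SNAPSHOT</version>
    <relativePath>../binding-parent</relativePath>
  </parent>
  <artifactId>accumulo1.7-binding</artifactId>
  <name>Accumulo 1.7 DB Binding</name>
  <properties>
    <hadoop.version>2.2.0</hadoop.version>
    <skipJDK9Tests>true</skipJDK9Tests>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-core</artifactId>
      <version>${accumulo.1.7.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <exclusions>
        <exclusion>
          <groupId>jdk.tools</groupId>
          <artifactId>jdk.tools</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-minicluster</artifactId>
      <version>${accumulo.1.7.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.13</version>
    </dependency>
  </dependencies>
  <build>
    <testResources>
      <testResource>
        <directory>../workloads</directory>
        <targetPath>workloads</targetPath>
      </testResource>
      <testResource>
        <directory>src/test/resources</directory>
      </testResource>
    </testResources>
  </build>
</project>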
================================================
FILE: accumulo1.7/src/main/conf/accumulo.properties
================================================
# Copyright 2014 Cloudera, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Sample Accumulo configuration properties
#
# You may either set properties here or via the command line.
#
# This will influence the keys we write
accumulo.columnFamily=YCSB
# This should be set based on your Accumulo cluster
#accumulo.instanceName=ExampleInstance
# Comma separated list of host:port tuples for the ZooKeeper quorum used
# by your Accumulo cluster
#accumulo.zooKeepers=zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181
# This user will need permissions on the table YCSB works against
#accumulo.username=ycsb
#accumulo.password=protectyaneck
# Controls how long our client writer will wait to buffer more data
# measured in milliseconds
accumulo.batchWriterMaxLatency=30000
# Controls how much data our client will attempt to buffer before sending
# measured in bytes
accumulo.batchWriterSize=100000
# Controls how many worker threads our client will use to parallelize writes
accumulo.batchWriterThreads=1
================================================
FILE: accumulo1.7/src/main/java/com/yahoo/ycsb/db/accumulo/AccumuloClient.java
================================================
/**
* Copyright (c) 2011 YCSB++ project, 2014-2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static java.nio.charset.StandardCharsets.UTF_8;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.SortedMap;
import java.util.Vector;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.user.WholeRowIterator;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.util.CleanUp;
import org.apache.hadoop.io.Text;
import com.yahoo.ycsb.ByteArrayByteIterator;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
/**
* Accumulo binding for YCSB.
*/
public class AccumuloClient extends DB {
private ZooKeeperInstance inst;
private Connector connector;
private Text colFam = new Text("");
private byte[] colFamBytes = new byte[0];
private final ConcurrentHashMap<String, BatchWriter> writers = new ConcurrentHashMap<>();
static {
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
CleanUp.shutdownNow();
}
});
}
@Override
public void init() throws DBException {
colFam = new Text(getProperties().getProperty("accumulo.columnFamily"));
colFamBytes = colFam.toString().getBytes(UTF_8);
inst = new ZooKeeperInstance(new ClientConfiguration()
.withInstance(getProperties().getProperty("accumulo.instanceName"))
.withZkHosts(getProperties().getProperty("accumulo.zooKeepers")));
try {
String principal = getProperties().getProperty("accumulo.username");
AuthenticationToken token =
new PasswordToken(getProperties().getProperty("accumulo.password"));
connector = inst.getConnector(principal, token);
} catch (AccumuloException | AccumuloSecurityException e) {
throw new DBException(e);
}
if (!(getProperties().getProperty("accumulo.pcFlag", "none").equals("none"))) {
System.err.println("Sorry, the ZK based producer/consumer implementation has been removed. " +
"Please see YCSB issue #416 for work on adding a general solution to coordinated work.");
}
}
@Override
public void cleanup() throws DBException {
try {
Iterator<BatchWriter> iterator = writers.values().iterator();
while (iterator.hasNext()) {
BatchWriter writer = iterator.next();
writer.close();
iterator.remove();
}
} catch (MutationsRejectedException e) {
throw new DBException(e);
}
}
/**
* Returns the BatchWriter for the given table, creating and caching one if
* none exists yet.
*
* @param table
* The table to open.
* @return the shared BatchWriter for that table
*/
public BatchWriter getWriter(String table) throws TableNotFoundException {
// tl;dr We're paying a cost for the ConcurrentHashMap here to deal with the DB api.
// We know that YCSB is really only ever going to send us data for one table, so using
// a concurrent data structure is overkill (especially in such a hot code path).
// However, the impact seems to be relatively negligible in trivial local tests and it's
// "more correct" WRT to the API.
BatchWriter writer = writers.get(table);
if (null == writer) {
BatchWriter newWriter = createBatchWriter(table);
BatchWriter oldWriter = writers.putIfAbsent(table, newWriter);
// Someone beat us to creating a BatchWriter for this table, use their BatchWriters
if (null != oldWriter) {
try {
// Make sure to clean up our new batchwriter!
newWriter.close();
} catch (MutationsRejectedException e) {
throw new RuntimeException(e);
}
writer = oldWriter;
} else {
writer = newWriter;
}
}
return writer;
}
/**
* Creates a BatchWriter with the expected configuration.
*
* @param table The table to write to
*/
private BatchWriter createBatchWriter(String table) throws TableNotFoundException {
BatchWriterConfig bwc = new BatchWriterConfig();
bwc.setMaxLatency(
Long.parseLong(getProperties()
.getProperty("accumulo.batchWriterMaxLatency", "30000")),
TimeUnit.MILLISECONDS);
bwc.setMaxMemory(Long.parseLong(
getProperties().getProperty("accumulo.batchWriterSize", "100000")));
final String numThreadsValue = getProperties().getProperty("accumulo.batchWriterThreads");
// Try to saturate the client machine.
int numThreads = Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
if (null != numThreadsValue) {
numThreads = Integer.parseInt(numThreadsValue);
}
System.err.println("Using " + numThreads + " threads to write data");
bwc.setMaxWriteThreads(numThreads);
return connector.createBatchWriter(table, bwc);
}
/**
* Gets a scanner from Accumulo over one row.
*
* @param row the row to scan
* @param fields the set of columns to scan
* @return an Accumulo {@link Scanner} bound to the given row and columns
*/
private Scanner getRow(String table, Text row, Set<String> fields) throws TableNotFoundException {
Scanner scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(row));
if (fields != null) {
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
return scanner;
}
@Override
public Status read(String table, String key, Set<String> fields,
Map<String, ByteIterator> result) {
Scanner scanner = null;
try {
scanner = getRow(table, new Text(key), null);
// Pick out the results we care about.
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
entry.getKey().getColumnQualifier(cq);
Value v = entry.getValue();
byte[] buf = v.get();
result.put(cq.toString(),
new ByteArrayByteIterator(buf));
}
} catch (Exception e) {
System.err.println("Error trying to read Accumulo table " + table + " " + key);
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status scan(String table, String startkey, int recordcount,
Set<String> fields, Vector<HashMap<String, ByteIterator>> result) {
// Just make the end 'infinity' and only read as much as we need.
Scanner scanner = null;
try {
scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(new Text(startkey), null));
// Have Accumulo send us complete rows, serialized in a single Key-Value pair
IteratorSetting cfg = new IteratorSetting(100, WholeRowIterator.class);
scanner.addScanIterator(cfg);
// If no fields are provided, we assume one column/row.
if (fields != null) {
// And add each of them as fields we want.
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
int count = 0;
for (Entry<Key, Value> entry : scanner) {
// Deserialize the row
SortedMap<Key, Value> row = WholeRowIterator.decodeRow(entry.getKey(), entry.getValue());
HashMap<String, ByteIterator> rowData;
if (null != fields) {
rowData = new HashMap<>(fields.size());
} else {
rowData = new HashMap<>();
}
result.add(rowData);
// Parse the data in the row, avoid unnecessary Text object creation
final Text cq = new Text();
for (Entry<Key, Value> rowEntry : row.entrySet()) {
rowEntry.getKey().getColumnQualifier(cq);
rowData.put(cq.toString(), new ByteArrayByteIterator(rowEntry.getValue().get()));
}
if (++count >= recordcount) { // Stop once recordcount rows have been read.
break;
}
}
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
} catch (IOException e) {
System.err.println("Error deserializing data from Accumulo.");
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status update(String table, String key,
Map<String, ByteIterator> values) {
BatchWriter bw = null;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error opening batch writer to Accumulo table " + table);
e.printStackTrace();
return Status.ERROR;
}
Mutation mutInsert = new Mutation(key.getBytes(UTF_8));
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
mutInsert.put(colFamBytes, entry.getKey().getBytes(UTF_8), entry.getValue().toArray());
}
try {
bw.addMutation(mutInsert);
} catch (MutationsRejectedException e) {
System.err.println("Error performing update.");
e.printStackTrace();
return Status.ERROR;
}
return Status.BATCHED_OK;
}
@Override
public Status insert(String t, String key,
Map<String, ByteIterator> values) {
return update(t, key, values);
}
@Override
public Status delete(String table, String key) {
BatchWriter bw;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
}
try {
deleteRow(table, new Text(key), bw);
} catch (TableNotFoundException | MutationsRejectedException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
} catch (RuntimeException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
}
return Status.OK;
}
// These functions are adapted from RowOperations.java:
private void deleteRow(String table, Text row, BatchWriter bw) throws MutationsRejectedException,
TableNotFoundException {
// TODO Use a batchDeleter instead
deleteRow(getRow(table, row, null), bw);
}
/**
* Deletes a row, given a Scanner of JUST that row.
*/
private void deleteRow(Scanner scanner, BatchWriter bw) throws MutationsRejectedException {
Mutation deleter = null;
// iterate through the keys
final Text row = new Text();
final Text cf = new Text();
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
// create a mutation for the row
if (deleter == null) {
entry.getKey().getRow(row);
deleter = new Mutation(row);
}
entry.getKey().getColumnFamily(cf);
entry.getKey().getColumnQualifier(cq);
// the remove function adds the key with the delete flag set to true
deleter.putDelete(cf, cq);
}
bw.addMutation(deleter);
}
}
================================================
FILE: accumulo1.7/src/main/java/com/yahoo/ycsb/db/accumulo/package-info.java
================================================
/**
* Copyright (c) 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* YCSB binding for Apache Accumulo.
*/
package com.yahoo.ycsb.db.accumulo;
================================================
FILE: accumulo1.7/src/test/java/com/yahoo/ycsb/db/accumulo/AccumuloTest.java
================================================
/*
* Copyright (c) 2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assume.assumeTrue;
import java.util.Map.Entry;
import java.util.Properties;
import com.yahoo.ycsb.Workload;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.measurements.Measurements;
import com.yahoo.ycsb.workloads.CoreWorkload;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.security.TablePermission;
import org.apache.accumulo.minicluster.MiniAccumuloCluster;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import org.junit.rules.TestName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* Use an Accumulo MiniCluster to test out basic workload operations with
* the Accumulo binding.
*/
public class AccumuloTest {
private static final Logger LOG = LoggerFactory.getLogger(AccumuloTest.class);
private static final int INSERT_COUNT = 2000;
private static final int TRANSACTION_COUNT = 2000;
@ClassRule
public static TemporaryFolder workingDir = new TemporaryFolder();
@Rule
public TestName test = new TestName();
private static MiniAccumuloCluster cluster;
private static Properties properties;
private Workload workload;
private DB client;
private Properties workloadProps;
private static boolean isWindows() {
final String os = System.getProperty("os.name");
return os.startsWith("Windows");
}
@BeforeClass
public static void setup() throws Exception {
// Minicluster setup fails on Windows with an UnsatisfiedLinkError.
// Skip if windows.
assumeTrue(!isWindows());
cluster = new MiniAccumuloCluster(workingDir.newFolder("accumulo").getAbsoluteFile(), "protectyaneck");
LOG.debug("starting minicluster");
cluster.start();
LOG.debug("creating connection for admin operations.");
// set up the table and user
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().create(CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
admin.securityOperations().createLocalUser("ycsb", new PasswordToken("protectyaneck"));
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.READ);
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.WRITE);
// set properties the binding will read
properties = new Properties();
properties.setProperty("accumulo.zooKeepers", cluster.getZooKeepers());
properties.setProperty("accumulo.instanceName", cluster.getInstanceName());
properties.setProperty("accumulo.columnFamily", "family");
properties.setProperty("accumulo.username", "ycsb");
properties.setProperty("accumulo.password", "protectyaneck");
// cut down the batch writer timeout so that writes will push through.
properties.setProperty("accumulo.batchWriterMaxLatency", "4");
// set these explicitly to the defaults at the time we're compiled, since they'll be inlined in our class.
properties.setProperty(CoreWorkload.TABLENAME_PROPERTY, CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.FIELD_COUNT_PROPERTY, CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.INSERT_ORDER_PROPERTY, "ordered");
}
@AfterClass
public static void clusterCleanup() throws Exception {
if (cluster != null) {
LOG.debug("shutting down minicluster");
cluster.stop();
cluster = null;
}
}
@Before
public void client() throws Exception {
LOG.debug("Loading workload properties for {}", test.getMethodName());
workloadProps = new Properties();
workloadProps.load(getClass().getResourceAsStream("/workloads/" + test.getMethodName()));
for (String prop : properties.stringPropertyNames()) {
workloadProps.setProperty(prop, properties.getProperty(prop));
}
// TODO we need a better test rig for 'run this ycsb workload'
LOG.debug("initializing measurements and workload");
Measurements.setProperties(workloadProps);
workload = new CoreWorkload();
workload.init(workloadProps);
LOG.debug("initializing client");
client = new AccumuloClient();
client.setProperties(workloadProps);
client.init();
}
@After
public void cleanup() throws Exception {
if (client != null) {
LOG.debug("cleaning up client");
client.cleanup();
client = null;
}
if (workload != null) {
LOG.debug("cleaning up workload");
workload.cleanup();
}
}
@After
public void truncateTable() throws Exception {
if (cluster != null) {
LOG.debug("truncating table {}", CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().deleteRows(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, null, null);
}
}
@Test
public void workloada() throws Exception {
runWorkload();
}
@Test
public void workloadb() throws Exception {
runWorkload();
}
@Test
public void workloadc() throws Exception {
runWorkload();
}
@Test
public void workloadd() throws Exception {
runWorkload();
}
@Test
public void workloade() throws Exception {
runWorkload();
}
/**
* Go through a workload cycle.
* <ol>
* <li>initialize thread-specific state</li>
* <li>load the workload dataset</li>
* <li>run workload transactions</li>
* </ol>
*/
private void runWorkload() throws Exception {
final Object state = workload.initThread(workloadProps,0,0);
LOG.debug("load");
for (int i = 0; i < INSERT_COUNT; i++) {
assertTrue("insert failed.", workload.doInsert(client, state));
}
// Ensure we wait long enough for the batch writer to flush
// TODO accumulo client should be flushing per insert by default.
Thread.sleep(2000);
LOG.debug("verify number of cells");
final Scanner scanner = cluster.getConnector("root", "protectyaneck").createScanner(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, Authorizations.EMPTY);
int count = 0;
for (Entry<Key, Value> entry : scanner) {
count++;
}
assertEquals("Didn't get enough total cells.", (Integer.valueOf(CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT) * INSERT_COUNT), count);
LOG.debug("run");
for (int i = 0; i < TRANSACTION_COUNT; i++) {
assertTrue("transaction failed.", workload.doTransaction(client, state));
}
}
}
================================================
FILE: accumulo1.7/src/test/resources/log4j.properties
================================================
#
# Copyright (c) 2015 YCSB contributors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Root logger option
log4j.rootLogger=INFO, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.conversionPattern=%d{yyyy/MM/dd HH:mm:ss} %-5p %c %x - %m%n
# Verbose logging for the YCSB Accumulo binding
log4j.logger.com.yahoo.ycsb.db.accumulo=DEBUG
# Suppress messages from ZooKeeper and Accumulo
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.org.apache.accumulo=WARN
================================================
FILE: accumulo1.8/README.md
================================================
## Quick Start
This section describes how to run YCSB on [Accumulo](https://accumulo.apache.org/).
### 1. Start Accumulo
See the [Accumulo Documentation](https://accumulo.apache.org/1.8/accumulo_user_manual.html#_installation)
for details on installing and running Accumulo.
Before running the YCSB test you must create the Accumulo table. Again see the
[Accumulo Documentation](https://accumulo.apache.org/1.8/accumulo_user_manual.html#_basic_administration)
for details. The default table name is `usertable`.
### 2. Set Up YCSB
Git clone YCSB and compile:

    git clone http://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn -pl com.yahoo.ycsb:accumulo1.8-binding -am clean package

### 3. Create the Accumulo table
By default, YCSB uses a table named `usertable`. Users must create this table before loading
data into Accumulo. For maximum Accumulo performance, the Accumulo table must be pre-split. A simple
Ruby script, based on the HBase README, can generate adequate split points. Tens of tablets per
TabletServer is a good starting point. Unless otherwise specified, the following commands should run
on any version of Accumulo.

    $ echo 'num_splits = 20; puts (1..num_splits).map {|i| "user#{1000+i*(9999-1000)/num_splits}"}' | ruby > /tmp/splits.txt
    $ accumulo shell -u <user> -p <password> -e "createtable usertable"
    $ accumulo shell -u <user> -p <password> -e "addsplits -t usertable -sf /tmp/splits.txt"
    $ accumulo shell -u <user> -p <password> -e "config -t usertable -s table.cache.block.enable=true"

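Where Ruby is not available, the same split points can be produced with a few lines of Java. The following is a minimal sketch that mirrors the arithmetic of the Ruby one-liner above; the class name `SplitPoints` and the method `generate` are hypothetical and are not part of YCSB or Accumulo.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper mirroring the Ruby one-liner; not part of YCSB. */
final class SplitPoints {
  private SplitPoints() { }

  /** Returns numSplits evenly spaced split points, ending at "user9999". */
  static List<String> generate(int numSplits) {
    if (numSplits <= 0) {
      throw new IllegalArgumentException("numSplits must be positive");
    }
    List<String> splits = new ArrayList<>(numSplits);
    for (int i = 1; i <= numSplits; i++) {
      // Integer division, matching the Ruby expression 1000+i*(9999-1000)/num_splits.
      splits.add("user" + (1000 + i * (9999 - 1000) / numSplits));
    }
    return splits;
  }

  public static void main(String[] args) {
    int numSplits = args.length > 0 ? Integer.parseInt(args[0]) : 20;
    // One split point per line, suitable for redirecting into /tmp/splits.txt.
    for (String split : generate(numSplits)) {
      System.out.println(split);
    }
  }
}
```

Redirect the program's output to `/tmp/splits.txt` and pass that file to `addsplits -sf` exactly as with the Ruby version.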
Additionally, some other configuration properties can increase performance. These
can be set on the Accumulo table via the shell after it is created. Setting the table durability
to `flush` relaxes the constraints on data durability during hard power outages (it avoids calls
to fsync). Accumulo defaults table compression to `gzip`, which is not particularly fast; `snappy`
is a faster and similarly efficient option. The mutation queue property controls how much write data
Accumulo will buffer in memory before performing a flush; this property should be set relative
to the amount of JVM heap the TabletServers are given.

Please note that the `table.durability` and `tserver.total.mutation.queue.max` properties exist
only in Accumulo 1.7 and later. There are no concise replacements for these properties in earlier versions.

    accumulo> config -s table.durability=flush
    accumulo> config -s tserver.total.mutation.queue.max=256M
    accumulo> config -t usertable -s table.file.compress.type=snappy

On repeated data loads, the following commands may be helpful to reset the state of the table quickly.

    accumulo> createtable tmp --copy-splits usertable --copy-config usertable
    accumulo> deletetable --force usertable
    accumulo> renametable tmp usertable
    accumulo> compact --wait -t accumulo.metadata

### 4. Load Data and Run Tests
Load the data:

    ./bin/ycsb load accumulo1.8 -s -P workloads/workloada \
        -p accumulo.zooKeepers=localhost \
        -p accumulo.columnFamily=ycsb \
        -p accumulo.instanceName=ycsb \
        -p accumulo.username=user \
        -p accumulo.password=supersecret \
        > outputLoad.txt

Run the workload test:

    ./bin/ycsb run accumulo1.8 -s -P workloads/workloada \
        -p accumulo.zooKeepers=localhost \
        -p accumulo.columnFamily=ycsb \
        -p accumulo.instanceName=ycsb \
        -p accumulo.username=user \
        -p accumulo.password=supersecret \
        > outputRun.txt

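The load and run invocations differ only in the phase name, so repeated benchmark runs are easy to script. The sketch below assembles the argument list from a map of `-p` properties; `YcsbCommand` and `build` are hypothetical names introduced for illustration, and the flags are only the ones shown in the commands above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for scripting YCSB invocations; not part of YCSB. */
final class YcsbCommand {
  private YcsbCommand() { }

  /** Builds the argument list for a load or run phase against the accumulo1.8 binding. */
  static List<String> build(String phase, String workload, Map<String, String> props) {
    if (!"load".equals(phase) && !"run".equals(phase)) {
      throw new IllegalArgumentException("phase must be load or run: " + phase);
    }
    List<String> cmd = new ArrayList<>();
    cmd.add("./bin/ycsb");
    cmd.add(phase);
    cmd.add("accumulo1.8");
    cmd.add("-s");
    cmd.add("-P");
    cmd.add(workload);
    // Each property becomes its own "-p key=value" pair, in insertion order.
    for (Map.Entry<String, String> e : props.entrySet()) {
      cmd.add("-p");
      cmd.add(e.getKey() + "=" + e.getValue());
    }
    return cmd;
  }

  public static void main(String[] args) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("accumulo.zooKeepers", "localhost");
    props.put("accumulo.columnFamily", "ycsb");
    props.put("accumulo.instanceName", "ycsb");
    props.put("accumulo.username", "user");
    props.put("accumulo.password", "supersecret");
    System.out.println(String.join(" ", build("load", "workloads/workloada", props)));
    System.out.println(String.join(" ", build("run", "workloads/workloada", props)));
  }
}
```

The returned list can be handed to `java.lang.ProcessBuilder`, with the output of each phase redirected to its own file so that load and run results are kept apart.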
## Accumulo Configuration Parameters
- `accumulo.zooKeepers`
  - The Accumulo cluster's [zookeeper servers](https://accumulo.apache.org/1.8/accumulo_user_manual.html#_connecting).
  - Should contain a comma-separated list of hostname or hostname:port values.
  - No default value.
- `accumulo.columnFamily`
  - The name of the column family to use to store the data within the table.
  - No default value.
- `accumulo.instanceName`
  - Name of the Accumulo [instance](https://accumulo.apache.org/1.8/accumulo_user_manual.html#_connecting).
  - No default value.
- `accumulo.username`
  - The username to use when connecting to Accumulo.
  - No default value.
- `accumulo.password`
  - The password for the user connecting to Accumulo.
  - No default value.
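Because none of the parameters listed above has a default, it can help to check that all of them are set before starting a long benchmark. The following is a minimal sketch of such a check; `AccumuloProps` and `missing` are hypothetical names and are not part of the YCSB binding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for the binding's properties; not part of YCSB. */
final class AccumuloProps {
  /** The parameters documented above as having no default value. */
  static final List<String> REQUIRED = Arrays.asList(
      "accumulo.zooKeepers",
      "accumulo.columnFamily",
      "accumulo.instanceName",
      "accumulo.username",
      "accumulo.password");

  private AccumuloProps() { }

  /** Returns the required keys that are absent or blank, in documentation order. */
  static List<String> missing(Properties props) {
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED) {
      String value = props.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("accumulo.zooKeepers", "localhost");
    props.setProperty("accumulo.columnFamily", "ycsb");
    // Reports the three keys that have not been set yet.
    System.out.println("Missing: " + missing(props));
  }
}
```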
================================================
FILE: accumulo1.8/pom.xml
================================================
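<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>binding-parent</artifactId>
    <version>0.14.0-SNAPSHOT</version>
    <relativePath>../binding-parent</relativePath>
  </parent>
  <artifactId>accumulo1.8-binding</artifactId>
  <name>Accumulo 1.8 DB Binding</name>
  <properties>
    <hadoop.version>2.6.4</hadoop.version>
    <skipJDK9Tests>true</skipJDK9Tests>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-core</artifactId>
      <version>${accumulo.1.8.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <exclusions>
        <exclusion>
          <groupId>jdk.tools</groupId>
          <artifactId>jdk.tools</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.accumulo</groupId>
      <artifactId>accumulo-minicluster</artifactId>
      <version>${accumulo.1.8.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.13</version>
    </dependency>
  </dependencies>
  <build>
    <testResources>
      <testResource>
        <directory>../workloads</directory>
        <targetPath>workloads</targetPath>
      </testResource>
      <testResource>
        <directory>src/test/resources</directory>
      </testResource>
    </testResources>
  </build>
</project>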
================================================
FILE: accumulo1.8/src/main/conf/accumulo.properties
================================================
# Copyright 2014 Cloudera, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Sample Accumulo configuration properties
#
# You may either set properties here or via the command line.
#
# This will influence the keys we write
accumulo.columnFamily=YCSB
# This should be set based on your Accumulo cluster
#accumulo.instanceName=ExampleInstance
# Comma separated list of host:port tuples for the ZooKeeper quorum used
# by your Accumulo cluster
#accumulo.zooKeepers=zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181
# This user will need permissions on the table YCSB works against
#accumulo.username=ycsb
#accumulo.password=protectyaneck
# Controls how long our client writer will wait to buffer more data
# measured in milliseconds
accumulo.batchWriterMaxLatency=30000
# Controls how much data our client will attempt to buffer before sending
# measured in bytes
accumulo.batchWriterSize=100000
# Controls how many worker threads our client will use to parallelize writes
accumulo.batchWriterThreads=1
================================================
FILE: accumulo1.8/src/main/java/com/yahoo/ycsb/db/accumulo/AccumuloClient.java
================================================
/**
* Copyright (c) 2011 YCSB++ project, 2014-2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static java.nio.charset.StandardCharsets.UTF_8;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.SortedMap;
import java.util.Vector;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import org.apache.accumulo.core.client.AccumuloException;
import org.apache.accumulo.core.client.AccumuloSecurityException;
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.IteratorSetting;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.TableNotFoundException;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.user.WholeRowIterator;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.util.CleanUp;
import org.apache.hadoop.io.Text;
import com.yahoo.ycsb.ByteArrayByteIterator;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
/**
* Accumulo binding for YCSB.
*/
public class AccumuloClient extends DB {
private ZooKeeperInstance inst;
private Connector connector;
private Text colFam = new Text("");
private byte[] colFamBytes = new byte[0];
private final ConcurrentHashMap<String, BatchWriter> writers = new ConcurrentHashMap<>();
static {
Runtime.getRuntime().addShutdownHook(new Thread() {
@Override
public void run() {
CleanUp.shutdownNow();
}
});
}
@Override
public void init() throws DBException {
colFam = new Text(getProperties().getProperty("accumulo.columnFamily"));
colFamBytes = colFam.toString().getBytes(UTF_8);
inst = new ZooKeeperInstance(new ClientConfiguration()
.withInstance(getProperties().getProperty("accumulo.instanceName"))
.withZkHosts(getProperties().getProperty("accumulo.zooKeepers")));
try {
String principal = getProperties().getProperty("accumulo.username");
AuthenticationToken token =
new PasswordToken(getProperties().getProperty("accumulo.password"));
connector = inst.getConnector(principal, token);
} catch (AccumuloException | AccumuloSecurityException e) {
throw new DBException(e);
}
if (!(getProperties().getProperty("accumulo.pcFlag", "none").equals("none"))) {
System.err.println("Sorry, the ZK based producer/consumer implementation has been removed. " +
"Please see YCSB issue #416 for work on adding a general solution to coordinated work.");
}
}
@Override
public void cleanup() throws DBException {
try {
Iterator<BatchWriter> iterator = writers.values().iterator();
while (iterator.hasNext()) {
BatchWriter writer = iterator.next();
writer.close();
iterator.remove();
}
} catch (MutationsRejectedException e) {
throw new DBException(e);
}
}
/**
* Returns the BatchWriter cached for the given table, creating and caching
* one the first time the table is requested.
*
* @param table
* The table to write to.
* @return the shared BatchWriter for the table
*/
public BatchWriter getWriter(String table) throws TableNotFoundException {
// tl;dr We're paying a cost for the ConcurrentHashMap here to deal with the DB api.
// We know that YCSB is really only ever going to send us data for one table, so using
// a concurrent data structure is overkill (especially in such a hot code path).
// However, the impact seems to be relatively negligible in trivial local tests and it's
// "more correct" WRT to the API.
BatchWriter writer = writers.get(table);
if (null == writer) {
BatchWriter newWriter = createBatchWriter(table);
BatchWriter oldWriter = writers.putIfAbsent(table, newWriter);
// Someone beat us to creating a BatchWriter for this table; use theirs.
if (null != oldWriter) {
try {
// Make sure to clean up our new batchwriter!
newWriter.close();
} catch (MutationsRejectedException e) {
throw new RuntimeException(e);
}
writer = oldWriter;
} else {
writer = newWriter;
}
}
return writer;
}
/**
* Creates a BatchWriter with the expected configuration.
*
* @param table The table to write to
*/
private BatchWriter createBatchWriter(String table) throws TableNotFoundException {
BatchWriterConfig bwc = new BatchWriterConfig();
bwc.setMaxLatency(
Long.parseLong(getProperties()
.getProperty("accumulo.batchWriterMaxLatency", "30000")),
TimeUnit.MILLISECONDS);
bwc.setMaxMemory(Long.parseLong(
getProperties().getProperty("accumulo.batchWriterSize", "100000")));
final String numThreadsValue = getProperties().getProperty("accumulo.batchWriterThreads");
// Try to saturate the client machine.
int numThreads = Math.max(1, Runtime.getRuntime().availableProcessors() / 2);
if (null != numThreadsValue) {
numThreads = Integer.parseInt(numThreadsValue);
}
System.err.println("Using " + numThreads + " threads to write data");
bwc.setMaxWriteThreads(numThreads);
return connector.createBatchWriter(table, bwc);
}
/**
* Gets a scanner from Accumulo over one row.
*
* @param row the row to scan
* @param fields the set of columns to scan
* @return an Accumulo {@link Scanner} bound to the given row and columns
*/
private Scanner getRow(String table, Text row, Set<String> fields) throws TableNotFoundException {
Scanner scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(row));
if (fields != null) {
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
return scanner;
}
@Override
public Status read(String table, String key, Set<String> fields,
Map<String, ByteIterator> result) {
Scanner scanner = null;
try {
// Restrict the scan to the requested fields; a null set fetches every column.
scanner = getRow(table, new Text(key), fields);
// Pick out the results we care about.
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
entry.getKey().getColumnQualifier(cq);
Value v = entry.getValue();
byte[] buf = v.get();
result.put(cq.toString(),
new ByteArrayByteIterator(buf));
}
} catch (Exception e) {
System.err.println("Error trying to read Accumulo table " + table + " " + key);
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status scan(String table, String startkey, int recordcount,
Set<String> fields, Vector<HashMap<String, ByteIterator>> result) {
// Just make the end 'infinity' and only read as much as we need.
Scanner scanner = null;
try {
scanner = connector.createScanner(table, Authorizations.EMPTY);
scanner.setRange(new Range(new Text(startkey), null));
// Have Accumulo send us complete rows, serialized in a single Key-Value pair
IteratorSetting cfg = new IteratorSetting(100, WholeRowIterator.class);
scanner.addScanIterator(cfg);
// If no fields are provided, we assume one column/row.
if (fields != null) {
// And add each of them as fields we want.
for (String field : fields) {
scanner.fetchColumn(colFam, new Text(field));
}
}
int count = 0;
for (Entry<Key, Value> entry : scanner) {
// Deserialize the row
SortedMap<Key, Value> row = WholeRowIterator.decodeRow(entry.getKey(), entry.getValue());
HashMap<String, ByteIterator> rowData;
if (null != fields) {
rowData = new HashMap<>(fields.size());
} else {
rowData = new HashMap<>();
}
result.add(rowData);
// Parse the data in the row, avoid unnecessary Text object creation
final Text cq = new Text();
for (Entry<Key, Value> rowEntry : row.entrySet()) {
rowEntry.getKey().getColumnQualifier(cq);
rowData.put(cq.toString(), new ByteArrayByteIterator(rowEntry.getValue().get()));
}
if (++count >= recordcount) { // Stop once recordcount rows have been read.
break;
}
}
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
} catch (IOException e) {
System.err.println("Error deserializing data from Accumulo.");
e.printStackTrace();
return Status.ERROR;
} finally {
if (null != scanner) {
scanner.close();
}
}
return Status.OK;
}
@Override
public Status update(String table, String key,
Map<String, ByteIterator> values) {
BatchWriter bw = null;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error opening batch writer to Accumulo table " + table);
e.printStackTrace();
return Status.ERROR;
}
Mutation mutInsert = new Mutation(key.getBytes(UTF_8));
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
mutInsert.put(colFamBytes, entry.getKey().getBytes(UTF_8), entry.getValue().toArray());
}
try {
bw.addMutation(mutInsert);
} catch (MutationsRejectedException e) {
System.err.println("Error performing update.");
e.printStackTrace();
return Status.ERROR;
}
return Status.BATCHED_OK;
}
@Override
public Status insert(String t, String key,
Map<String, ByteIterator> values) {
return update(t, key, values);
}
@Override
public Status delete(String table, String key) {
BatchWriter bw;
try {
bw = getWriter(table);
} catch (TableNotFoundException e) {
System.err.println("Error trying to connect to Accumulo table.");
e.printStackTrace();
return Status.ERROR;
}
try {
deleteRow(table, new Text(key), bw);
} catch (TableNotFoundException | MutationsRejectedException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
} catch (RuntimeException e) {
System.err.println("Error performing delete.");
e.printStackTrace();
return Status.ERROR;
}
return Status.OK;
}
// These functions are adapted from RowOperations.java:
private void deleteRow(String table, Text row, BatchWriter bw) throws MutationsRejectedException,
TableNotFoundException {
// TODO Use a batchDeleter instead
deleteRow(getRow(table, row, null), bw);
}
/**
* Deletes a row, given a Scanner of JUST that row.
*/
private void deleteRow(Scanner scanner, BatchWriter bw) throws MutationsRejectedException {
Mutation deleter = null;
// iterate through the keys
final Text row = new Text();
final Text cf = new Text();
final Text cq = new Text();
for (Entry<Key, Value> entry : scanner) {
// create a mutation for the row
if (deleter == null) {
entry.getKey().getRow(row);
deleter = new Mutation(row);
}
entry.getKey().getColumnFamily(cf);
entry.getKey().getColumnQualifier(cq);
// the remove function adds the key with the delete flag set to true
deleter.putDelete(cf, cq);
}
bw.addMutation(deleter);
}
}
================================================
FILE: accumulo1.8/src/main/java/com/yahoo/ycsb/db/accumulo/package-info.java
================================================
/**
* Copyright (c) 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* YCSB binding for Apache Accumulo.
*/
package com.yahoo.ycsb.db.accumulo;
================================================
FILE: accumulo1.8/src/test/java/com/yahoo/ycsb/db/accumulo/AccumuloTest.java
================================================
/*
* Copyright (c) 2016 YCSB contributors.
* All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.accumulo;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assume.assumeTrue;
import java.util.Map.Entry;
import java.util.Properties;
import com.yahoo.ycsb.Workload;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.measurements.Measurements;
import com.yahoo.ycsb.workloads.CoreWorkload;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.security.TablePermission;
import org.apache.accumulo.minicluster.MiniAccumuloCluster;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.ClassRule;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import org.junit.rules.TestName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* Use an Accumulo MiniCluster to test out basic workload operations with
* the Accumulo binding.
*/
public class AccumuloTest {
private static final Logger LOG = LoggerFactory.getLogger(AccumuloTest.class);
private static final int INSERT_COUNT = 2000;
private static final int TRANSACTION_COUNT = 2000;
@ClassRule
public static TemporaryFolder workingDir = new TemporaryFolder();
@Rule
public TestName test = new TestName();
private static MiniAccumuloCluster cluster;
private static Properties properties;
private Workload workload;
private DB client;
private Properties workloadProps;
private static boolean isWindows() {
final String os = System.getProperty("os.name");
return os.startsWith("Windows");
}
@BeforeClass
public static void setup() throws Exception {
// Minicluster setup fails on Windows with an UnsatisfiedLinkError.
// Skip if windows.
assumeTrue(!isWindows());
cluster = new MiniAccumuloCluster(workingDir.newFolder("accumulo").getAbsoluteFile(), "protectyaneck");
LOG.debug("starting minicluster");
cluster.start();
LOG.debug("creating connection for admin operations.");
// set up the table and user
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().create(CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
admin.securityOperations().createLocalUser("ycsb", new PasswordToken("protectyaneck"));
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.READ);
admin.securityOperations().grantTablePermission("ycsb", CoreWorkload.TABLENAME_PROPERTY_DEFAULT, TablePermission.WRITE);
// set properties the binding will read
properties = new Properties();
properties.setProperty("accumulo.zooKeepers", cluster.getZooKeepers());
properties.setProperty("accumulo.instanceName", cluster.getInstanceName());
properties.setProperty("accumulo.columnFamily", "family");
properties.setProperty("accumulo.username", "ycsb");
properties.setProperty("accumulo.password", "protectyaneck");
// cut down the batch writer timeout so that writes will push through.
properties.setProperty("accumulo.batchWriterMaxLatency", "4");
// set these explicitly to the defaults at the time we're compiled, since they'll be inlined in our class.
properties.setProperty(CoreWorkload.TABLENAME_PROPERTY, CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.FIELD_COUNT_PROPERTY, CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT);
properties.setProperty(CoreWorkload.INSERT_ORDER_PROPERTY, "ordered");
}
@AfterClass
public static void clusterCleanup() throws Exception {
if (cluster != null) {
LOG.debug("shutting down minicluster");
cluster.stop();
cluster = null;
}
}
@Before
public void client() throws Exception {
LOG.debug("Loading workload properties for {}", test.getMethodName());
workloadProps = new Properties();
workloadProps.load(getClass().getResourceAsStream("/workloads/" + test.getMethodName()));
for (String prop : properties.stringPropertyNames()) {
workloadProps.setProperty(prop, properties.getProperty(prop));
}
// TODO we need a better test rig for 'run this ycsb workload'
LOG.debug("initializing measurements and workload");
Measurements.setProperties(workloadProps);
workload = new CoreWorkload();
workload.init(workloadProps);
LOG.debug("initializing client");
client = new AccumuloClient();
client.setProperties(workloadProps);
client.init();
}
@After
public void cleanup() throws Exception {
if (client != null) {
LOG.debug("cleaning up client");
client.cleanup();
client = null;
}
if (workload != null) {
LOG.debug("cleaning up workload");
workload.cleanup();
}
}
@After
public void truncateTable() throws Exception {
if (cluster != null) {
LOG.debug("truncating table {}", CoreWorkload.TABLENAME_PROPERTY_DEFAULT);
final Connector admin = cluster.getConnector("root", "protectyaneck");
admin.tableOperations().deleteRows(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, null, null);
}
}
@Test
public void workloada() throws Exception {
runWorkload();
}
@Test
public void workloadb() throws Exception {
runWorkload();
}
@Test
public void workloadc() throws Exception {
runWorkload();
}
@Test
public void workloadd() throws Exception {
runWorkload();
}
@Test
public void workloade() throws Exception {
runWorkload();
}
/**
* Go through a workload cycle.
* <ol>
* <li>initialize thread-specific state</li>
* <li>load the workload dataset</li>
* <li>run workload transactions</li>
* </ol>
*/
private void runWorkload() throws Exception {
final Object state = workload.initThread(workloadProps,0,0);
LOG.debug("load");
for (int i = 0; i < INSERT_COUNT; i++) {
assertTrue("insert failed.", workload.doInsert(client, state));
}
// Ensure we wait long enough for the batch writer to flush
// TODO accumulo client should be flushing per insert by default.
Thread.sleep(2000);
LOG.debug("verify number of cells");
final Scanner scanner = cluster.getConnector("root", "protectyaneck").createScanner(CoreWorkload.TABLENAME_PROPERTY_DEFAULT, Authorizations.EMPTY);
int count = 0;
for (Entry<Key, Value> entry : scanner) {
count++;
}
assertEquals("Didn't get enough total cells.", (Integer.valueOf(CoreWorkload.FIELD_COUNT_PROPERTY_DEFAULT) * INSERT_COUNT), count);
LOG.debug("run");
for (int i = 0; i < TRANSACTION_COUNT; i++) {
assertTrue("transaction failed.", workload.doTransaction(client, state));
}
}
}
================================================
FILE: accumulo1.8/src/test/resources/log4j.properties
================================================
#
# Copyright (c) 2015 YCSB contributors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.
#
# Root logger option
log4j.rootLogger=INFO, stderr
log4j.appender.stderr=org.apache.log4j.ConsoleAppender
log4j.appender.stderr.target=System.err
log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
log4j.appender.stderr.layout.conversionPattern=%d{yyyy/MM/dd HH:mm:ss} %-5p %c %x - %m%n
# Log the Accumulo binding at DEBUG
log4j.logger.com.yahoo.ycsb.db.accumulo=DEBUG
# Suppress messages from ZooKeeper and Accumulo
log4j.logger.org.apache.zookeeper=ERROR
log4j.logger.org.apache.accumulo=WARN
================================================
FILE: aerospike/README.md
================================================
## Quick Start
This section describes how to run YCSB on Aerospike.
### 1. Start Aerospike
### 2. Install Java and Maven
### 3. Set Up YCSB
Clone YCSB and compile:

    git clone http://github.com/brianfrankcooper/YCSB.git
    cd YCSB
    mvn -pl com.yahoo.ycsb:aerospike-binding -am clean package
### 4. Provide Aerospike Connection Parameters
The following connection parameters are available.
* `as.host` - The Aerospike cluster to connect to (default: `localhost`)
* `as.port` - The port to connect to (default: `3000`)
* `as.user` - The user to connect as (no default)
* `as.password` - The password for the user (no default)
* `as.timeout` - The transaction and connection timeout (in ms, default: `10000`)
* `as.namespace` - The namespace to be used for the benchmark (default: `ycsb`)
Add them to the workload file or set them on the command line with `-p`, as in:

    ./bin/ycsb load aerospike -s -P workloads/workloada -p as.timeout=5000 >outputLoad.txt
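
As a rough illustration of how these parameters resolve, the sketch below applies the defaults listed above to a `java.util.Properties` object. The `AerospikeSettings` class and its `fromProperties` method are hypothetical names used only for this example; they are not part of YCSB or the Aerospike client.

```java
import java.util.Properties;

/** Hypothetical helper, for illustration only; not part of YCSB. */
public final class AerospikeSettings {
  public final String host;
  public final int port;
  public final int timeoutMs;
  public final String namespace;

  private AerospikeSettings(String host, int port, int timeoutMs, String namespace) {
    this.host = host;
    this.port = port;
    this.timeoutMs = timeoutMs;
    this.namespace = namespace;
  }

  /** Falls back to the documented default for every parameter that is not set. */
  public static AerospikeSettings fromProperties(Properties props) {
    return new AerospikeSettings(
        props.getProperty("as.host", "localhost"),
        Integer.parseInt(props.getProperty("as.port", "3000")),
        Integer.parseInt(props.getProperty("as.timeout", "10000")),
        props.getProperty("as.namespace", "ycsb"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Stands in for "-p as.timeout=5000" on the command line.
    props.setProperty("as.timeout", "5000");
    AerospikeSettings s = fromProperties(props);
    System.out.println(s.host + ":" + s.port + " timeout=" + s.timeoutMs
        + "ms namespace=" + s.namespace);
    // prints "localhost:3000 timeout=5000ms namespace=ycsb"
  }
}
```

Only the parameter that was set explicitly differs from the defaults; `as.user` and `as.password` are left out here because they have no default.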
### 5. Load Data and Run Tests
Load the data:

    ./bin/ycsb load aerospike -s -P workloads/workloada >outputLoad.txt

Run the workload test:

    ./bin/ycsb run aerospike -s -P workloads/workloada >outputRun.txt
================================================
FILE: aerospike/pom.xml
================================================
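<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>binding-parent</artifactId>
    <version>0.14.0-SNAPSHOT</version>
    <relativePath>../binding-parent</relativePath>
  </parent>
  <artifactId>aerospike-binding</artifactId>
  <name>Aerospike DB Binding</name>
  <packaging>jar</packaging>
  <dependencies>
    <dependency>
      <groupId>com.aerospike</groupId>
      <artifactId>aerospike-client</artifactId>
      <version>${aerospike.version}</version>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>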
================================================
FILE: aerospike/src/main/java/com/yahoo/ycsb/db/AerospikeClient.java
================================================
/**
* Copyright (c) 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db;
import com.aerospike.client.AerospikeException;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import com.aerospike.client.policy.ClientPolicy;
import com.aerospike.client.policy.Policy;
import com.aerospike.client.policy.RecordExistsAction;
import com.aerospike.client.policy.WritePolicy;
import com.yahoo.ycsb.ByteArrayByteIterator;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.Vector;
/**
* YCSB binding for Aerospike.
*/
public class AerospikeClient extends com.yahoo.ycsb.DB {
private static final String DEFAULT_HOST = "localhost";
private static final String DEFAULT_PORT = "3000";
private static final String DEFAULT_TIMEOUT = "10000";
private static final String DEFAULT_NAMESPACE = "ycsb";
private String namespace = null;
private com.aerospike.client.AerospikeClient client = null;
private Policy readPolicy = new Policy();
private WritePolicy insertPolicy = new WritePolicy();
private WritePolicy updatePolicy = new WritePolicy();
private WritePolicy deletePolicy = new WritePolicy();
@Override
public void init() throws DBException {
insertPolicy.recordExistsAction = RecordExistsAction.CREATE_ONLY;
updatePolicy.recordExistsAction = RecordExistsAction.REPLACE_ONLY;
Properties props = getProperties();
namespace = props.getProperty("as.namespace", DEFAULT_NAMESPACE);
String host = props.getProperty("as.host", DEFAULT_HOST);
String user = props.getProperty("as.user");
String password = props.getProperty("as.password");
int port = Integer.parseInt(props.getProperty("as.port", DEFAULT_PORT));
int timeout = Integer.parseInt(props.getProperty("as.timeout",
DEFAULT_TIMEOUT));
readPolicy.timeout = timeout;
insertPolicy.timeout = timeout;
updatePolicy.timeout = timeout;
deletePolicy.timeout = timeout;
ClientPolicy clientPolicy = new ClientPolicy();
if (user != null && password != null) {
clientPolicy.user = user;
clientPolicy.password = password;
}
try {
client =
new com.aerospike.client.AerospikeClient(clientPolicy, host, port);
} catch (AerospikeException e) {
throw new DBException(String.format("Error while creating Aerospike " +
"client for %s:%d.", host, port), e);
}
}
@Override
public void cleanup() throws DBException {
client.close();
}
@Override
public Status read(String table, String key, Set<String> fields,
Map<String, ByteIterator> result) {
try {
Record record;
if (fields != null) {
record = client.get(readPolicy, new Key(namespace, table, key),
fields.toArray(new String[fields.size()]));
} else {
record = client.get(readPolicy, new Key(namespace, table, key));
}
if (record == null) {
System.err.println("Record key " + key + " not found (read)");
return Status.ERROR;
}
for (Map.Entry<String, Object> entry: record.bins.entrySet()) {
result.put(entry.getKey(),
new ByteArrayByteIterator((byte[])entry.getValue()));
}
return Status.OK;
} catch (AerospikeException e) {
System.err.println("Error while reading key " + key + ": " + e);
return Status.ERROR;
}
}
@Override
public Status scan(String table, String start, int count, Set<String> fields,
Vector<HashMap<String, ByteIterator>> result) {
System.err.println("Scan not implemented");
return Status.ERROR;
}
private Status write(String table, String key, WritePolicy writePolicy,
Map<String, ByteIterator> values) {
Bin[] bins = new Bin[values.size()];
int index = 0;
for (Map.Entry<String, ByteIterator> entry: values.entrySet()) {
bins[index] = new Bin(entry.getKey(), entry.getValue().toArray());
++index;
}
Key keyObj = new Key(namespace, table, key);
try {
client.put(writePolicy, keyObj, bins);
return Status.OK;
} catch (AerospikeException e) {
System.err.println("Error while writing key " + key + ": " + e);
return Status.ERROR;
}
}
@Override
public Status update(String table, String key,
Map<String, ByteIterator> values) {
return write(table, key, updatePolicy, values);
}
@Override
public Status insert(String table, String key,
Map<String, ByteIterator> values) {
return write(table, key, insertPolicy, values);
}
@Override
public Status delete(String table, String key) {
try {
if (!client.delete(deletePolicy, new Key(namespace, table, key))) {
System.err.println("Record key " + key + " not found (delete)");
return Status.ERROR;
}
return Status.OK;
} catch (AerospikeException e) {
System.err.println("Error while deleting key " + key + ": " + e);
return Status.ERROR;
}
}
}
================================================
FILE: aerospike/src/main/java/com/yahoo/ycsb/db/package-info.java
================================================
/**
* Copyright (c) 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* YCSB binding for Aerospike.
*/
package com.yahoo.ycsb.db;
================================================
FILE: arangodb/.gitignore
================================================
/bin/
================================================
FILE: arangodb/README.md
================================================
## Quick Start
This section describes how to run YCSB on ArangoDB.
### 1. Start ArangoDB
See https://docs.arangodb.com/Installing/index.html
### 2. Install Java and Maven
Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html
and get the URL to download the RPM onto your server. For example:

    wget http://download.oracle.com/otn-pub/java/jdk/7u40-b43/jdk-7u40-linux-x64.rpm?AuthParam=11232426132 -O jdk-7u40-linux-x64.rpm
    rpm -Uvh jdk-7u40-linux-x64.rpm

Or install via yum/apt-get:

    sudo yum install java-devel

Download Maven from http://maven.apache.org/download.cgi

    wget http://ftp.heanet.ie/mirrors/www.apache.org/dist/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz
    sudo tar xzf apache-maven-*-bin.tar.gz -C /usr/local
    cd /usr/local
    sudo ln -s apache-maven-* maven
    sudo vi /etc/profile.d/maven.sh

Add the following to `maven.sh`:

    export M2_HOME=/usr/local/maven
    export PATH=${M2_HOME}/bin:${PATH}

Reload bash and test mvn:

    bash
    mvn -version
### 3. Set Up YCSB
Clone the YCSB source code:

    git clone https://github.com/brianfrankcooper/YCSB.git
### 4. Run YCSB
Now you are ready to run! First, drop the existing collection "usertable" under database "ycsb":

    db._collection("usertable").drop()

Then, load the data:

    ./bin/ycsb load arangodb -s -P workloads/workloada -p arangodb.ip=xxx -p arangodb.port=xxx

Then, run the workload:

    ./bin/ycsb run arangodb -s -P workloads/workloada -p arangodb.ip=xxx -p arangodb.port=xxx
See the next section for the list of configuration parameters for ArangoDB.
## ArangoDB Configuration Parameters
- `arangodb.ip`
  - Default value is `localhost`.
- `arangodb.port`
  - Default value is `8529`.
- `arangodb.waitForSync`
  - Default value is `false`.
- `arangodb.transactionUpdate`
  - Default value is `false`.
- `arangodb.dropDBBeforeRun`
  - Default value is `false`.
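
The boolean parameters are parsed with `Boolean.parseBoolean`, so only the string `true` (in any letter case) enables a flag; values such as `yes` or `1` are silently treated as `false`. The sketch below illustrates that behavior. The `ArangoSettingsSketch` class and its `flag` helper are hypothetical names for this example and are not part of the binding.

```java
import java.util.Properties;

/** Hypothetical helper, for illustration only; not part of the ArangoDB binding. */
public final class ArangoSettingsSketch {
  private ArangoSettingsSketch() {
  }

  /** Reads a boolean setting: only "true" (case-insensitive) yields true. */
  public static boolean flag(Properties props, String name, String defaultValue) {
    return Boolean.parseBoolean(props.getProperty(name, defaultValue));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Stands in for "-p arangodb.transactionUpdate=TRUE -p arangodb.dropDBBeforeRun=yes".
    props.setProperty("arangodb.transactionUpdate", "TRUE");
    props.setProperty("arangodb.dropDBBeforeRun", "yes");

    System.out.println(flag(props, "arangodb.transactionUpdate", "false")); // prints "true"
    System.out.println(flag(props, "arangodb.dropDBBeforeRun", "false"));   // prints "false"
    // The port is parsed as an integer and falls back to 8529 when unset.
    System.out.println(Integer.parseInt(props.getProperty("arangodb.port", "8529"))); // prints "8529"
  }
}
```

When a flag does not seem to take effect, checking that its value is spelled exactly `true` is a reasonable first step.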
================================================
FILE: arangodb/conf/logback.xml
================================================
%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
================================================
FILE: arangodb/pom.xml
================================================
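<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.yahoo.ycsb</groupId>
    <artifactId>binding-parent</artifactId>
    <version>0.14.0-SNAPSHOT</version>
    <relativePath>../binding-parent</relativePath>
  </parent>
  <artifactId>arangodb-binding</artifactId>
  <name>ArangoDB Binding</name>
  <packaging>jar</packaging>
  <dependencies>
    <dependency>
      <groupId>com.arangodb</groupId>
      <artifactId>arangodb-java-driver</artifactId>
      <version>${arangodb.version}</version>
    </dependency>
    <dependency>
      <groupId>com.yahoo.ycsb</groupId>
      <artifactId>core</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-api</artifactId>
      <version>1.7.13</version>
      <type>jar</type>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.1.3</version>
      <type>jar</type>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-core</artifactId>
      <version>1.1.3</version>
      <type>jar</type>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.12</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>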
================================================
FILE: arangodb/src/main/java/com/yahoo/ycsb/db/ArangoDBClient.java
================================================
/**
* Copyright (c) 2012 - 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db;
import com.arangodb.ArangoConfigure;
import com.arangodb.ArangoDriver;
import com.arangodb.ArangoException;
import com.arangodb.ArangoHost;
import com.arangodb.DocumentCursor;
import com.arangodb.ErrorNums;
import com.arangodb.entity.BaseDocument;
import com.arangodb.entity.DocumentEntity;
import com.arangodb.entity.EntityFactory;
import com.arangodb.entity.TransactionEntity;
import com.arangodb.util.MapBuilder;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.Status;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.StringByteIterator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.Vector;
import java.util.concurrent.atomic.AtomicInteger;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
* ArangoDB binding for the YCSB framework, using the ArangoDB Inc. driver.
* <p>
* See the README.md for configuration information.
* </p>
*
* @see "ArangoDB Inc. driver"
*/
public class ArangoDBClient extends DB {
private static Logger logger = LoggerFactory.getLogger(ArangoDBClient.class);
/**
* The database name to access.
*/
private static String databaseName = "ycsb";
/**
* Count the number of times initialized to teardown on the last
* {@link #cleanup()}.
*/
private static final AtomicInteger INIT_COUNT = new AtomicInteger(0);
/** ArangoDB Driver related, Singleton. */
private static ArangoDriver arangoDriver;
private static Boolean dropDBBeforeRun;
private static Boolean waitForSync = true;
private static Boolean transactionUpdate = false;
/**
* Initialize any state for this DB. Called once per DB instance; there is
* one DB instance per client thread.
*
* In this binding, all client threads in a process share a single
* ArangoDriver instance (the same approach as the MongoDB binding).
*/
@Override
public void init() throws DBException {
INIT_COUNT.incrementAndGet();
synchronized (ArangoDBClient.class) {
if (arangoDriver != null) {
return;
}
Properties props = getProperties();
// Set the DB address
String ip = props.getProperty("arangodb.ip", "localhost");
String portStr = props.getProperty("arangodb.port", "8529");
int port = Integer.parseInt(portStr);
// If clear db before run
String dropDBBeforeRunStr = props.getProperty("arangodb.dropDBBeforeRun", "false");
dropDBBeforeRun = Boolean.parseBoolean(dropDBBeforeRunStr);
// Set the sync mode
String waitForSyncStr = props.getProperty("arangodb.waitForSync", "false");
waitForSync = Boolean.parseBoolean(waitForSyncStr);
// Set if transaction for update
String transactionUpdateStr = props.getProperty("arangodb.transactionUpdate", "false");
transactionUpdate = Boolean.parseBoolean(transactionUpdateStr);
// Init ArangoDB connection
try {
ArangoConfigure arangoConfigure = new ArangoConfigure();
arangoConfigure.setArangoHost(new ArangoHost(ip, port));
arangoConfigure.init();
arangoDriver = new ArangoDriver(arangoConfigure);
} catch (Exception e) {
logger.error("Failed to initialize ArangoDB", e);
System.exit(-1);
}
// Init the database
if (dropDBBeforeRun) {
// Try delete first
try {
arangoDriver.deleteDatabase(databaseName);
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_DATABASE_NOT_FOUND) {
logger.error("Failed to delete database: {} with ex: {}", databaseName, e.toString());
System.exit(-1);
} else {
logger.info("Fail to delete DB, already deleted: {}", databaseName);
}
}
}
try {
arangoDriver.createDatabase(databaseName);
logger.info("Database created: " + databaseName);
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_DUPLICATE_NAME) {
logger.error("Failed to create database: {} with ex: {}", databaseName, e.toString());
System.exit(-1);
} else {
logger.info("DB already exists: {}", databaseName);
}
}
// Always set the default db
arangoDriver.setDefaultDatabase(databaseName);
logger.info("ArangoDB client connection created to {}:{}", ip, port);
// Log the configuration
logger.info("Arango Configuration: dropDBBeforeRun: {}; address: {}:{}; databaseName: {};"
+ " waitForSync: {}; transactionUpdate: {};",
dropDBBeforeRun, ip, port, databaseName, waitForSync, transactionUpdate);
}
}
/**
* Cleanup any state for this DB. Called once per DB instance; there is one
* DB instance per client thread.
*
* Because all client threads in a process share a single ArangoDriver
* instance (the same approach as the MongoDB binding), the driver is
* released only when the last client cleans up.
*/
@Override
public void cleanup() throws DBException {
if (INIT_COUNT.decrementAndGet() == 0) {
arangoDriver = null;
logger.info("Local cleaned up.");
}
}
/**
* Insert a record in the database. Any field/value pairs in the specified
* values HashMap will be written into the record with the specified record
* key.
*
* @param table
* The name of the table
* @param key
* The record key of the record to insert.
* @param values
* A HashMap of field/value pairs to insert in the record
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status insert(String table, String key, Map<String, ByteIterator> values) {
try {
BaseDocument toInsert = new BaseDocument(key);
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
toInsert.addAttribute(entry.getKey(), byteIteratorToString(entry.getValue()));
}
arangoDriver.createDocument(table, toInsert, true/*create collection if not exist*/,
waitForSync);
return Status.OK;
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_UNIQUE_CONSTRAINT_VIOLATED) {
logger.error("Fail to insert: {} {} with ex {}", table, key, e.toString());
} else {
logger.debug("Trying to create document with duplicate key: {} {}", table, key);
return Status.BAD_REQUEST;
}
} catch (RuntimeException e) {
logger.error("Exception while trying insert {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Read a record from the database. Each field/value pair from the result
* will be stored in a HashMap.
*
* @param table
* The name of the table
* @param key
* The record key of the record to read.
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A HashMap of field/value pairs for the result
* @return Zero on success, a non-zero error code on error or "not found".
*/
@SuppressWarnings("unchecked")
@Override
public Status read(String table, String key, Set<String> fields, Map<String, ByteIterator> result) {
try {
DocumentEntity<BaseDocument> targetDoc = arangoDriver.getDocument(table, key, BaseDocument.class);
BaseDocument aDocument = targetDoc.getEntity();
if (!this.fillMap(result, aDocument.getProperties(), fields)) {
return Status.ERROR;
}
return Status.OK;
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_DOCUMENT_NOT_FOUND) {
logger.error("Fail to read: {} {} with ex {}", table, key, e.toString());
} else {
logger.debug("Trying to read document not exist: {} {}", table, key);
return Status.NOT_FOUND;
}
} catch (RuntimeException e) {
logger.error("Exception while trying read {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Update a record in the database. Any field/value pairs in the specified
* values HashMap will be written into the record with the specified record
* key, overwriting any existing values with the same field name.
*
* @param table
* The name of the table
* @param key
* The record key of the record to write.
* @param values
* A HashMap of field/value pairs to update in the record
* @return Zero on success, a non-zero error code on error. See this class's
* description for a discussion of error codes.
*/
@Override
public Status update(String table, String key, Map<String, ByteIterator> values) {
try {
if (!transactionUpdate) {
BaseDocument updateDoc = new BaseDocument();
for (String field : values.keySet()) {
updateDoc.addAttribute(field, byteIteratorToString(values.get(field)));
}
arangoDriver.updateDocument(table, key, updateDoc);
return Status.OK;
} else {
// id for documentHandle
String transactionAction = "function (id) {"
// use internal database functions
+ "var db = require('internal').db;"
// collection.update(document, data, overwrite, keepNull, waitForSync)
+ String.format("db._update(id, %s, true, false, %s);}",
mapToJson(values), Boolean.toString(waitForSync).toLowerCase());
TransactionEntity transaction = arangoDriver.createTransaction(transactionAction);
transaction.addWriteCollection(table);
transaction.setParams(createDocumentHandle(table, key));
arangoDriver.executeTransaction(transaction);
return Status.OK;
}
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_DOCUMENT_NOT_FOUND) {
logger.error("Fail to update: {} {} with ex {}", table, key, e.toString());
} else {
logger.debug("Trying to update document not exist: {} {}", table, key);
return Status.NOT_FOUND;
}
} catch (RuntimeException e) {
logger.error("Exception while trying update {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Delete a record from the database.
*
* @param table
* The name of the table
* @param key
* The record key of the record to delete.
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status delete(String table, String key) {
try {
arangoDriver.deleteDocument(table, key);
return Status.OK;
} catch (ArangoException e) {
if (e.getErrorNumber() != ErrorNums.ERROR_ARANGO_DOCUMENT_NOT_FOUND) {
logger.error("Fail to delete: {} {} with ex {}", table, key, e.toString());
} else {
logger.debug("Trying to delete document not exist: {} {}", table, key);
return Status.NOT_FOUND;
}
} catch (RuntimeException e) {
logger.error("Exception while trying delete {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Perform a range scan for a set of records in the database. Each
* field/value pair from the result will be stored in a HashMap.
*
* @param table
* The name of the table
* @param startkey
* The record key of the first record to read.
* @param recordcount
* The number of records to read
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A Vector of HashMaps, where each HashMap is a set field/value
* pairs for one record
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status scan(String table, String startkey, int recordcount, Set<String> fields,
Vector<HashMap<String, ByteIterator>> result) {
DocumentCursor<BaseDocument> cursor = null;
try {
String aqlQuery = String.format(
"FOR target IN %s FILTER target._key >= @key SORT target._key ASC LIMIT %d RETURN %s ", table,
recordcount, constructReturnForAQL(fields, "target"));
Map<String, Object> bindVars = new MapBuilder().put("key", startkey).get();
cursor = arangoDriver.executeDocumentQuery(aqlQuery, bindVars, null, BaseDocument.class);
Iterator<BaseDocument> iterator = cursor.entityIterator();
while (iterator.hasNext()) {
BaseDocument aDocument = iterator.next();
HashMap<String, ByteIterator> aMap = new HashMap<String, ByteIterator>(aDocument.getProperties().size());
if (!this.fillMap(aMap, aDocument.getProperties())) {
return Status.ERROR;
}
result.add(aMap);
}
return Status.OK;
} catch (Exception e) {
logger.error("Exception while trying scan {} {} {} with ex {}", table, startkey, recordcount, e.toString());
} finally {
if (cursor != null) {
try {
cursor.close();
} catch (ArangoException e) {
logger.error("Fail to close cursor", e);
}
}
}
return Status.ERROR;
}
private String createDocumentHandle(String collectionName, String documentKey) throws ArangoException {
validateCollectionName(collectionName);
return collectionName + "/" + documentKey;
}
private void validateCollectionName(String name) throws ArangoException {
if (name.indexOf('/') != -1) {
throw new ArangoException("does not allow '/' in name.");
}
}
private String constructReturnForAQL(Set<String> fields, String targetName) {
// Construct the AQL query string.
String resultDes = targetName;
if (fields != null && fields.size() != 0) {
StringBuilder builder = new StringBuilder("{");
for (String field : fields) {
builder.append(String.format("\n\"%s\" : %s.%s,", field, targetName, field));
}
//Replace last ',' to newline.
builder.setCharAt(builder.length() - 1, '\n');
builder.append("}");
resultDes = builder.toString();
}
return resultDes;
}
private boolean fillMap(Map<String, ByteIterator> resultMap, Map<String, Object> properties) {
return fillMap(resultMap, properties, null);
}
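/**
* Fills the map with the properties from the BaseDocument.
*
* @param resultMap
* The map to fill.
* @param properties
* The document properties to copy values from.
* @param fields
* The fields to copy, or null or empty for all of them.
* @return true on success, false if a value is not a String
*/
@SuppressWarnings("unchecked")
private boolean fillMap(Map<String, ByteIterator> resultMap, Map<String, Object> properties, Set<String> fields) {
if (fields == null || fields.size() == 0) {
for (Map.Entry<String, Object> entry : properties.entrySet()) {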
if (entry.getValue() instanceof String) {
resultMap.put(entry.getKey(),
stringToByteIterator((String)(entry.getValue())));
} else {
logger.error("Error! Not the format expected! Actually is {}",
entry.getValue().getClass().getName());
return false;
}
}
} else {
for (String field : fields) {
if (properties.get(field) instanceof String) {
resultMap.put(field, stringToByteIterator((String)(properties.get(field))));
} else {
logger.error("Error! Not the format expected! Actually is {}",
properties.get(field).getClass().getName());
return false;
}
}
}
return true;
}
private String byteIteratorToString(ByteIterator byteIter) {
return new String(byteIter.toArray());
}
private ByteIterator stringToByteIterator(String content) {
return new StringByteIterator(content);
}
private String mapToJson(Map<String, ByteIterator> values) {
Map<String, String> intervalRst = new HashMap<String, String>();
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
intervalRst.put(entry.getKey(), byteIteratorToString(entry.getValue()));
}
return EntityFactory.toJsonString(intervalRst);
}
}
================================================
FILE: arangodb/src/main/java/com/yahoo/ycsb/db/package-info.java
================================================
/**
* Copyright (c) 2012 - 2015 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* The YCSB binding for ArangoDB.
*/
package com.yahoo.ycsb.db;
================================================
FILE: arangodb3/.gitignore
================================================
================================================
FILE: arangodb3/README.md
================================================
## Quick Start
This section describes how to run YCSB on ArangoDB.
### 1. Start ArangoDB
See https://docs.arangodb.com/Installing/index.html
### 2. Install Java and Maven
Go to http://www.oracle.com/technetwork/java/javase/downloads/index.html
and get the URL to download the RPM onto your server. For example:

    wget "http://download.oracle.com/otn-pub/java/jdk/7u40-b43/jdk-7u40-linux-x64.rpm?AuthParam=11232426132" -O jdk-7u40-linux-x64.rpm
    rpm -Uvh jdk-7u40-linux-x64.rpm

Or install via yum/apt-get:

    sudo yum install java-devel

Download Maven from http://maven.apache.org/download.cgi

    wget http://ftp.heanet.ie/mirrors/www.apache.org/dist/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz
    sudo tar xzf apache-maven-*-bin.tar.gz -C /usr/local
    cd /usr/local
    sudo ln -s apache-maven-* maven
    sudo vi /etc/profile.d/maven.sh

Add the following to `maven.sh`:

    export M2_HOME=/usr/local/maven
    export PATH=${M2_HOME}/bin:${PATH}

Reload bash and test `mvn`:

    bash
    mvn -version

### 3. Set Up YCSB
Clone the YCSB source code:

    git clone https://github.com/brianfrankcooper/YCSB.git

### 4. Run YCSB
Now you are ready to run! First, drop the existing `usertable` collection in the `ycsb` database:

    db._collection("usertable").drop()

Then, load the data:

    ./bin/ycsb load arangodb3 -s -P workloads/workloada -p arangodb.ip=xxx -p arangodb.port=xxx

Then, run the workload:

    ./bin/ycsb run arangodb3 -s -P workloads/workloada -p arangodb.ip=xxx -p arangodb.port=xxx

See the next section for the list of configuration parameters for ArangoDB.

## ArangoDB Configuration Parameters
- `arangodb.ip`
  - Default value is `localhost`.
- `arangodb.port`
  - Default value is `8529`.
- `arangodb.waitForSync`
  - Default value is `false` (the value `ArangoDB3Client.init()` falls back to).
- `arangodb.transactionUpdate`
  - Default value is `false`.
- `arangodb.dropDBBeforeRun`
  - Default value is `false`.
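
As a rough sketch of how these parameters resolve, the snippet below mirrors the property lookups in `ArangoDB3Client.init()`: each `-p` value falls back to a default when it is absent. `ArangoSettings` is a hypothetical holder class introduced only for illustration and is not part of the binding. The property names and defaults are the ones read by the client source in this repository, where `arangodb.waitForSync` falls back to `false`.

```java
import java.util.Properties;

/** Hypothetical settings holder for illustration; not part of the YCSB binding. */
final class ArangoSettings {
  final String ip;
  final int port;
  final boolean waitForSync;
  final boolean transactionUpdate;
  final boolean dropDBBeforeRun;

  private ArangoSettings(Properties props) {
    // Property names and defaults copied from ArangoDB3Client.init().
    ip = props.getProperty("arangodb.ip", "localhost");
    port = Integer.parseInt(props.getProperty("arangodb.port", "8529"));
    waitForSync = Boolean.parseBoolean(
        props.getProperty("arangodb.waitForSync", "false"));
    transactionUpdate = Boolean.parseBoolean(
        props.getProperty("arangodb.transactionUpdate", "false"));
    dropDBBeforeRun = Boolean.parseBoolean(
        props.getProperty("arangodb.dropDBBeforeRun", "false"));
  }

  /** Resolves the settings from a Properties object, applying defaults. */
  static ArangoSettings fromProperties(Properties props) {
    return new ArangoSettings(props);
  }

  public static void main(String[] args) {
    // Equivalent to passing: -p arangodb.port=8530 -p arangodb.waitForSync=true
    Properties props = new Properties();
    props.setProperty("arangodb.port", "8530");
    props.setProperty("arangodb.waitForSync", "true");
    ArangoSettings s = fromProperties(props);
    System.out.println(s.ip + ":" + s.port + " waitForSync=" + s.waitForSync);
  }
}
```

Note that `Boolean.parseBoolean` treats anything other than a case-insensitive `true` as `false`, so a typo such as `arangodb.waitForSync=yes` silently disables the option.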
================================================
FILE: arangodb3/conf/logback.xml
================================================
%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
================================================
FILE: arangodb3/pom.xml
================================================
4.0.0com.yahoo.ycsbbinding-parent0.14.0-SNAPSHOT../binding-parentarangodb3-bindingArangoDB3 Bindingjarcom.arangodbarangodb-java-driver${arangodb3.version}com.yahoo.ycsbcore${project.version}providedorg.slf4jslf4j-api1.7.13jarcompilech.qos.logbacklogback-classic1.1.3jarprovidedch.qos.logbacklogback-core1.1.3jarprovidedjunitjunit4.12test
================================================
FILE: arangodb3/src/main/java/com/yahoo/ycsb/db/arangodb/ArangoDB3Client.java
================================================
/**
* Copyright (c) 2017 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db.arangodb;
import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.Vector;
import java.util.Map.Entry;
import java.util.concurrent.atomic.AtomicInteger;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.arangodb.ArangoCursor;
import com.arangodb.ArangoDB;
import com.arangodb.ArangoDBException;
import com.arangodb.entity.BaseDocument;
import com.arangodb.model.DocumentCreateOptions;
import com.arangodb.model.TransactionOptions;
import com.arangodb.util.MapBuilder;
import com.arangodb.velocypack.VPackBuilder;
import com.arangodb.velocypack.VPackSlice;
import com.arangodb.velocypack.ValueType;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DB;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
import com.yahoo.ycsb.StringByteIterator;
/**
* ArangoDB binding for YCSB framework using the ArangoDB Inc. driver
*
* See the README.md for configuration information.
*
*
* @see ArangoDB Inc.
* driver
*/
public class ArangoDB3Client extends DB {
private static Logger logger = LoggerFactory.getLogger(ArangoDB3Client.class);
/**
* Count the number of times initialized to teardown on the last
* {@link #cleanup()}.
*/
private static final AtomicInteger INIT_COUNT = new AtomicInteger(0);
/** ArangoDB Driver related, Singleton. */
private ArangoDB arangoDB;
private String databaseName = "ycsb";
private String collectionName;
private Boolean dropDBBeforeRun;
private Boolean waitForSync = false;
private Boolean transactionUpdate = false;
/**
* Initialize any state for this DB. Called once per DB instance; there is
* one DB instance per client thread.
*
* Actually, one client process will share one DB instance here (consistent
* with the MongoDB driver).
*/
@Override
public void init() throws DBException {
synchronized (ArangoDB3Client.class) {
Properties props = getProperties();
collectionName = props.getProperty("table", "usertable");
// Set the DB address
String ip = props.getProperty("arangodb.ip", "localhost");
String portStr = props.getProperty("arangodb.port", "8529");
int port = Integer.parseInt(portStr);
// If clear db before run
String dropDBBeforeRunStr = props.getProperty("arangodb.dropDBBeforeRun", "false");
dropDBBeforeRun = Boolean.parseBoolean(dropDBBeforeRunStr);
// Set the sync mode
String waitForSyncStr = props.getProperty("arangodb.waitForSync", "false");
waitForSync = Boolean.parseBoolean(waitForSyncStr);
// Set if transaction for update
String transactionUpdateStr = props.getProperty("arangodb.transactionUpdate", "false");
transactionUpdate = Boolean.parseBoolean(transactionUpdateStr);
// Init ArangoDB connection
try {
arangoDB = new ArangoDB.Builder().host(ip).port(port).build();
} catch (Exception e) {
logger.error("Failed to initialize ArangoDB", e);
System.exit(-1);
}
if(INIT_COUNT.getAndIncrement() == 0) {
// Init the database
if (dropDBBeforeRun) {
// Try delete first
try {
arangoDB.db(databaseName).drop();
} catch (ArangoDBException e) {
logger.info("Fail to delete DB: {}", databaseName);
}
}
try {
arangoDB.createDatabase(databaseName);
logger.info("Database created: " + databaseName);
} catch (ArangoDBException e) {
logger.error("Failed to create database: {} with ex: {}", databaseName, e.toString());
}
try {
arangoDB.db(databaseName).createCollection(collectionName);
logger.info("Collection created: " + collectionName);
} catch (ArangoDBException e) {
logger.error("Failed to create collection: {} with ex: {}", collectionName, e.toString());
}
logger.info("ArangoDB client connection created to {}:{}", ip, port);
// Log the configuration
logger.info("Arango Configuration: dropDBBeforeRun: {}; address: {}:{}; databaseName: {};"
+ " waitForSync: {}; transactionUpdate: {};",
dropDBBeforeRun, ip, port, databaseName, waitForSync, transactionUpdate);
}
}
}
/**
* Cleanup any state for this DB. Called once per DB instance; there is one
* DB instance per client thread.
*
* Actually, one client process will share one DB instance here (consistent
* with the MongoDB driver).
*/
@Override
public void cleanup() throws DBException {
if (INIT_COUNT.decrementAndGet() == 0) {
arangoDB.shutdown();
arangoDB = null;
logger.info("Local cleaned up.");
}
}
/**
* Insert a record in the database. Any field/value pairs in the specified
* values HashMap will be written into the record with the specified record
* key.
*
* @param table
* The name of the table
* @param key
* The record key of the record to insert.
* @param values
* A HashMap of field/value pairs to insert in the record
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status insert(String table, String key, Map<String, ByteIterator> values) {
try {
BaseDocument toInsert = new BaseDocument(key);
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
toInsert.addAttribute(entry.getKey(), byteIteratorToString(entry.getValue()));
}
DocumentCreateOptions options = new DocumentCreateOptions().waitForSync(waitForSync);
arangoDB.db(databaseName).collection(table).insertDocument(toInsert, options);
return Status.OK;
} catch (ArangoDBException e) {
logger.error("Exception while trying insert {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Read a record from the database. Each field/value pair from the result
* will be stored in a HashMap.
*
* @param table
* The name of the table
* @param key
* The record key of the record to read.
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A HashMap of field/value pairs for the result
* @return Zero on success, a non-zero error code on error or "not found".
*/
@Override
public Status read(String table, String key, Set<String> fields, Map<String, ByteIterator> result) {
try {
VPackSlice document = arangoDB.db(databaseName).collection(table).getDocument(key, VPackSlice.class, null);
if (!this.fillMap(result, document, fields)) {
return Status.ERROR;
}
return Status.OK;
} catch (ArangoDBException e) {
logger.error("Exception while trying read {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Update a record in the database. Any field/value pairs in the specified
* values HashMap will be written into the record with the specified record
* key, overwriting any existing values with the same field name.
*
* @param table
* The name of the table
* @param key
* The record key of the record to write.
* @param values
* A HashMap of field/value pairs to update in the record
* @return Zero on success, a non-zero error code on error. See this class's
* description for a discussion of error codes.
*/
@Override
public Status update(String table, String key, Map<String, ByteIterator> values) {
try {
if (!transactionUpdate) {
BaseDocument updateDoc = new BaseDocument();
for (Entry<String, ByteIterator> field : values.entrySet()) {
updateDoc.addAttribute(field.getKey(), byteIteratorToString(field.getValue()));
}
arangoDB.db(databaseName).collection(table).updateDocument(key, updateDoc);
return Status.OK;
} else {
// id for documentHandle
String transactionAction = "function (id) {"
// use internal database functions
+ "var db = require('internal').db;"
// collection.update(document, data, overwrite, keepNull, waitForSync)
+ String.format("db._update(id, %s, true, false, %s);}",
mapToJson(values), Boolean.toString(waitForSync).toLowerCase());
TransactionOptions options = new TransactionOptions();
options.writeCollections(table);
options.params(createDocumentHandle(table, key));
arangoDB.db(databaseName).transaction(transactionAction, Void.class, options);
return Status.OK;
}
} catch (ArangoDBException e) {
logger.error("Exception while trying update {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Delete a record from the database.
*
* @param table
* The name of the table
* @param key
* The record key of the record to delete.
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status delete(String table, String key) {
try {
arangoDB.db(databaseName).collection(table).deleteDocument(key);
return Status.OK;
} catch (ArangoDBException e) {
logger.error("Exception while trying delete {} {} with ex {}", table, key, e.toString());
}
return Status.ERROR;
}
/**
* Perform a range scan for a set of records in the database. Each
* field/value pair from the result will be stored in a HashMap.
*
* @param table
* The name of the table
* @param startkey
* The record key of the first record to read.
* @param recordcount
* The number of records to read
* @param fields
* The list of fields to read, or null for all of them
* @param result
* A Vector of HashMaps, where each HashMap is a set field/value
* pairs for one record
* @return Zero on success, a non-zero error code on error. See the
* {@link DB} class's description for a discussion of error codes.
*/
@Override
public Status scan(String table, String startkey, int recordcount, Set<String> fields,
Vector<HashMap<String, ByteIterator>> result) {
ArangoCursor<VPackSlice> cursor = null;
try {
String aqlQuery = String.format(
"FOR target IN %s FILTER target._key >= @key SORT target._key ASC LIMIT %d RETURN %s ", table,
recordcount, constructReturnForAQL(fields, "target"));
Map<String, Object> bindVars = new MapBuilder().put("key", startkey).get();
cursor = arangoDB.db(databaseName).query(aqlQuery, bindVars, null, VPackSlice.class);
while (cursor.hasNext()) {
VPackSlice aDocument = cursor.next();
HashMap<String, ByteIterator> aMap = new HashMap<String, ByteIterator>(aDocument.size());
if (!this.fillMap(aMap, aDocument)) {
return Status.ERROR;
}
result.add(aMap);
}
return Status.OK;
} catch (Exception e) {
logger.error("Exception while trying scan {} {} {} with ex {}", table, startkey, recordcount, e.toString());
} finally {
if (cursor != null) {
try {
cursor.close();
} catch (IOException e) {
logger.error("Fail to close cursor", e);
}
}
}
return Status.ERROR;
}
private String createDocumentHandle(String collection, String documentKey) throws ArangoDBException {
validateCollectionName(collection);
return collection + "/" + documentKey;
}
private void validateCollectionName(String name) throws ArangoDBException {
if (name.indexOf('/') != -1) {
throw new ArangoDBException("does not allow '/' in name.");
}
}
private String constructReturnForAQL(Set<String> fields, String targetName) {
// Construct the AQL query string.
String resultDes = targetName;
if (fields != null && fields.size() != 0) {
StringBuilder builder = new StringBuilder("{");
for (String field : fields) {
builder.append(String.format("\n\"%s\" : %s.%s,", field, targetName, field));
}
//Replace last ',' to newline.
builder.setCharAt(builder.length() - 1, '\n');
builder.append("}");
resultDes = builder.toString();
}
return resultDes;
}
private boolean fillMap(Map<String, ByteIterator> resultMap, VPackSlice document) {
return fillMap(resultMap, document, null);
}
/**
* Fills the map with the string attributes of the VPackSlice document.
*
* @param resultMap
* The map to fill.
* @param document
* The record to read from
* @param fields
* The list of fields to read, or null for all of them
* @return isSuccess
*/
private boolean fillMap(Map<String, ByteIterator> resultMap, VPackSlice document, Set<String> fields) {
if (fields == null || fields.size() == 0) {
for (Iterator<Entry<String, VPackSlice>> iterator = document.objectIterator(); iterator.hasNext();) {
Entry<String, VPackSlice> next = iterator.next();
VPackSlice value = next.getValue();
if (value.isString()) {
resultMap.put(next.getKey(), stringToByteIterator(value.getAsString()));
} else if (!value.isCustom()) {
// Log the VelocyPack value type; value.getClass() is always VPackSlice.
logger.error("Error! Not the format expected! Actually is {}",
value.getType());
return false;
}
}
} else {
for (String field : fields) {
VPackSlice value = document.get(field);
if (value.isString()) {
resultMap.put(field, stringToByteIterator(value.getAsString()));
} else if (!value.isCustom()) {
// Log the VelocyPack value type; value.getClass() is always VPackSlice.
logger.error("Error! Not the format expected! Actually is {}",
value.getType());
return false;
}
}
}
return true;
}
private String byteIteratorToString(ByteIterator byteIter) {
return new String(byteIter.toArray());
}
private ByteIterator stringToByteIterator(String content) {
return new StringByteIterator(content);
}
private String mapToJson(Map<String, ByteIterator> values) {
VPackBuilder builder = new VPackBuilder().add(ValueType.OBJECT);
for (Map.Entry<String, ByteIterator> entry : values.entrySet()) {
builder.add(entry.getKey(), byteIteratorToString(entry.getValue()));
}
builder.close();
return arangoDB.util().deserialize(builder.slice(), String.class);
}
}
================================================
FILE: arangodb3/src/main/java/com/yahoo/ycsb/db/arangodb/package-info.java
================================================
/**
* Copyright (c) 2017 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
/**
* The YCSB binding for ArangoDB.
*/
package com.yahoo.ycsb.db.arangodb;
================================================
FILE: asynchbase/README.md
================================================
# AsyncHBase Driver for YCSB
This driver provides a YCSB workload binding for Apache HBase using an alternative to the included HBase client. AsyncHBase is completely asynchronous for all operations and is particularly useful for write heavy workloads. Note that it supports a subset of the HBase client APIs but supports all public released versions of HBase.
## Quickstart
### 1. Set Up HBase
Follow directions 1 to 3 from ``hbase098``'s readme.
### 2. Load a Workload
Switch to the root of the YCSB repo and choose the workload you want to run and `load` it first. With the CLI you must provide the column family at a minimum if HBase is running on localhost. Otherwise you must provide connection properties via CLI or the path to a config file. Additional configuration parameters are available below.
```
bin/ycsb load asynchbase -p columnfamily=cf -P workloads/workloada
```
The `load` step only executes inserts into the datastore. After loading data, run the same workload to mix reads with writes.
```
bin/ycsb run asynchbase -p columnfamily=cf -P workloads/workloada
```
## Configuration Options
The following options can be configured using the CLI (using the `-p` parameter) or via a Java-style properties configuration file. Check the [AsyncHBase Configuration](http://opentsdb.github.io/asynchbase/docs/build/html/configuration.html) documentation for additional tuning parameters.
* `columnfamily`: (Required) The column family to target.
* `config`: Optional full path to a configuration file with AsyncHBase options.
* `hbase.zookeeper.quorum`: Zookeeper quorum list.
* `hbase.zookeeper.znode.parent`: Path used by HBase in Zookeeper. Default is "/hbase".
* `debug`: If true, prints debug information to standard out. The default is false.
* `clientbuffering`: Whether or not to use client side buffering and batching of write operations. This can significantly improve throughput, but it should be disabled when measuring insert/update/delete latencies. `AsyncHBaseClient` defaults to false.
* `durability`: When set to false, writes and deletes bypass the WAL for quicker responses. Default is true.
* `jointimeout`: A timeout value, in milliseconds, for waiting on operations synchronously before an error is thrown. Default is 30000.
* `prefetchmeta`: Whether or not to read meta for all regions in the table and connect to the proper region servers before starting operations. Defaults to false.
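
The sketch below illustrates how these options could be resolved from a `Properties` object, following the lookups in `AsyncHBaseClient.init()`. `AsyncHBaseOptions` is a hypothetical class introduced for illustration only. The property names and fallbacks are taken from the client source in this repository (which reads `durability` and falls back to `false` for `clientbuffering`). The real client throws a `DBException` when `columnfamily` is missing; this sketch substitutes `IllegalArgumentException` to stay self-contained.

```java
import java.util.Properties;

/** Hypothetical options holder for illustration; not part of the YCSB binding. */
final class AsyncHBaseOptions {
  final String columnFamily;
  final boolean clientSideBuffering;
  final boolean durability;
  final long joinTimeoutMs;
  final boolean prefetchMeta;
  final boolean debug;

  private AsyncHBaseOptions(Properties props) {
    columnFamily = props.getProperty("columnfamily");
    if (columnFamily == null || columnFamily.isEmpty()) {
      // The real client throws DBException here.
      throw new IllegalArgumentException("No columnfamily specified");
    }
    // Enabled only by an explicit (case-insensitive) "true".
    clientSideBuffering =
        "true".equalsIgnoreCase(props.getProperty("clientbuffering", "false"));
    // Disabled only by an explicit (case-insensitive) "false".
    durability = !"false".equalsIgnoreCase(props.getProperty("durability", "true"));
    joinTimeoutMs = Integer.parseInt(props.getProperty("jointimeout", "30000"));
    prefetchMeta =
        "true".equalsIgnoreCase(props.getProperty("prefetchmeta", "false"));
    // The client compares "debug" against the exact string "true".
    debug = "true".equals(props.getProperty("debug"));
  }

  /** Resolves the options from a Properties object, applying fallbacks. */
  static AsyncHBaseOptions fromProperties(Properties props) {
    return new AsyncHBaseOptions(props);
  }

  public static void main(String[] args) {
    // Equivalent to passing: -p columnfamily=cf -p durability=false
    Properties props = new Properties();
    props.setProperty("columnfamily", "cf");
    props.setProperty("durability", "false");
    AsyncHBaseOptions o = fromProperties(props);
    System.out.println("cf=" + o.columnFamily + " durability=" + o.durability
        + " jointimeout=" + o.joinTimeoutMs);
  }
}
```

Because durability is switched off only by an explicit `false`, any other value (including a typo) leaves the WAL enabled, which is the safer failure mode for a benchmark run.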
Note: This module includes some Google Guava source files from version 12 that were later removed but are still required by HBase's test modules for setting up the mini cluster during integration testing.
================================================
FILE: asynchbase/pom.xml
================================================
4.0.0com.yahoo.ycsbbinding-parent0.14.0-SNAPSHOT../binding-parent/asynchbase-bindingAsyncHBase Client Binding for Apache HBasetrueorg.hbaseasynchbase${asynchbase.version}com.yahoo.ycsbcore${project.version}providedorg.apache.zookeeperzookeeper3.4.5log4jlog4jorg.slf4jslf4j-log4j12jlinejlinejunitjunitorg.jboss.nettynettyjunitjunit4.12testorg.apache.hbasehbase-testing-util${hbase10.version}testjdk.toolsjdk.toolsorg.apache.hbasehbase-client${hbase10.version}testjdk.toolsjdk.toolslog4jlog4j1.2.17testorg.slf4jlog4j-over-slf4j1.7.7testorg.apache.maven.pluginsmaven-surefire-plugin2.20-Xms4096m -Xms4096m
================================================
FILE: asynchbase/src/main/java/com/yahoo/ycsb/db/AsyncHBaseClient.java
================================================
/**
* Copyright (c) 2016 YCSB contributors. All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License"); you
* may not use this file except in compliance with the License. You
* may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
* implied. See the License for the specific language governing
* permissions and limitations under the License. See accompanying
* LICENSE file.
*/
package com.yahoo.ycsb.db;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.Vector;
import org.hbase.async.Bytes;
import org.hbase.async.Config;
import org.hbase.async.DeleteRequest;
import org.hbase.async.GetRequest;
import org.hbase.async.HBaseClient;
import org.hbase.async.KeyValue;
import org.hbase.async.PutRequest;
import org.hbase.async.Scanner;
import com.yahoo.ycsb.ByteArrayByteIterator;
import com.yahoo.ycsb.ByteIterator;
import com.yahoo.ycsb.DBException;
import com.yahoo.ycsb.Status;
import static com.yahoo.ycsb.workloads.CoreWorkload.TABLENAME_PROPERTY;
import static com.yahoo.ycsb.workloads.CoreWorkload.TABLENAME_PROPERTY_DEFAULT;
/**
* Alternative Java client for Apache HBase.
*
* This client provides a subset of the main HBase client and uses a completely
* asynchronous pipeline for all calls. It is particularly useful for write heavy
* workloads. It is also compatible with all production versions of HBase.
*/
public class AsyncHBaseClient extends com.yahoo.ycsb.DB {
public static final Charset UTF8_CHARSET = Charset.forName("UTF8");
private static final String CLIENT_SIDE_BUFFERING_PROPERTY = "clientbuffering";
private static final String DURABILITY_PROPERTY = "durability";
private static final String PREFETCH_META_PROPERTY = "prefetchmeta";
private static final String CONFIG_PROPERTY = "config";
private static final String COLUMN_FAMILY_PROPERTY = "columnfamily";
private static final String JOIN_TIMEOUT_PROPERTY = "jointimeout";
private static final String JOIN_TIMEOUT_PROPERTY_DEFAULT = "30000";
/** Mutex for instantiating a single instance of the client. */
private static final Object MUTEX = new Object();
/** Used for tracking running thread counts so we know when to shut down the client. */
private static int threadCount = 0;
/** The client that's used for all threads. */
private static HBaseClient client;
/** Print debug information to standard out. */
private boolean debug = false;
/** The column family used for the workload. */
private byte[] columnFamilyBytes;
/** Cache for the last table name/ID to avoid byte conversions. */
private String lastTable = "";
private byte[] lastTableBytes;
private long joinTimeout;
/** Whether puts and deletes go through the WAL; when false, the WAL is bypassed. */
private boolean durability = true;
/**
* If true, buffer mutations on the client. This is the default behavior for
* AsyncHBase. For measuring insert/update/delete latencies, client side
* buffering should be disabled.
*
* A single instance of this
*/
private boolean clientSideBuffering = false;
@Override
public void init() throws DBException {
if (getProperties().getProperty(CLIENT_SIDE_BUFFERING_PROPERTY, "false")
.toLowerCase().equals("true")) {
clientSideBuffering = true;
}
if (getProperties().getProperty(DURABILITY_PROPERTY, "true")
.toLowerCase().equals("false")) {
durability = false;
}
final String columnFamily = getProperties().getProperty(COLUMN_FAMILY_PROPERTY);
if (columnFamily == null || columnFamily.isEmpty()) {
System.err.println("Error, must specify a columnfamily for HBase table");
throw new DBException("No columnfamily specified");
}
columnFamilyBytes = columnFamily.getBytes();
if ((getProperties().getProperty("debug") != null)
&& (getProperties().getProperty("debug").compareTo("true") == 0)) {
debug = true;
}
joinTimeout = Integer.parseInt(getProperties().getProperty(
JOIN_TIMEOUT_PROPERTY, JOIN_TIMEOUT_PROPERTY_DEFAULT));
final boolean prefetchMeta = getProperties()
.getProperty(PREFETCH_META_PROPERTY, "false")
.toLowerCase().equals("true");
try {
synchronized (MUTEX) {
++threadCount;
if (client == null) {
final String configPath = getProperties().getProperty(CONFIG_PROPERTY);
final Config config;
if (configPath == null || configPath.isEmpty()) {
config = new Config();
final Iterator<Entry<Object, Object>> iterator = getProperties()
.entrySet().iterator();
while (iterator.hasNext()) {
final Entry