Pinned repositories
- OryxProject/oryx
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
962 contributions in the last year
Contribution activity
December 2016
Created a pull request in apache/spark that received 9 comments
[SPARK-18678][ML] Skewed reservoir sampling in SamplingUtils
What changes were proposed in this pull request? Fix reservoir sampling bias for small k. An off-by-one error meant that the probability of replace…
- [SPARK-18929][ML] Add Tweedie distribution in GLM
- [SPARK-19002][BUILD] Check pep8 against dev/*.py scripts
- [SPARK-19003][DOCS] Add Java example in Spark Streaming Guide, section Design Patterns for using foreachRDD
- [SPARK-19007]Speedup and optimize the GradientBoostedTrees
- [SPARK-18819][CORE] Double byte alignment on ARM platforms
- [SPARK-19010][CORE] Include Kryo exception in case of overflow
- [SPARK-18922][TESTS] Fix more path-related test failures on Windows
- [SPARK-18991][Core]Change ContextCleaner.referenceBuffer to use ConcurrentHashMap to make it faster
- [SPARK-18837][WEBUI] Very long stage descriptions do not wrap in the UI
- [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on Pyspark examples
- [SPARK-18963] o.a.s.unsafe.types.UTF8StringSuite.writeToOutputStreamIntArray test
- [SPARK-18036][ML][MLLIB] Fixing decision trees handling edge cases
- [SPARK-18972][Core]Fix the netty thread names for RPC
- [SPARK-18960][SQL][SS] Avoid double reading file which is being copied.
- [SPARK-16473][MLLIB] Fix BisectingKMeans Algorithm failing in edge case
- [SPARK-18687][Pyspark][SQL]Backward compatibility - creating a Dataframe on a new SQLContext object fails with a Derby error
- [SPARK-18951] Upgrade com.thoughtworks.paranamer/paranamer to 2.6
- [SPARK-18923][DOC][BUILD] Support skipping R/Python API docs
- [DOC][BUILD][MINOR] add doc on new make-distribution switches
- [SPARK-18953][CORE][WEB UI] Do now show the link to a dead worker on the master page
- [BUILD] make-distribution should find JAVA_HOME for non-RHEL systems
- [SPARK-18903][SPARKR] Add API to get SparkUI URL
- [SPARK-17455][MLlib] Improve PAVA implementation in IsotonicRegression
- [SPARK-18922][TESTS] Fix more resource-closing-related and path-related test failures in identified ones on Windows
- [SPARK-18808][ML][MLLIB] ml.KMeansModel.transform is very inefficient
- [SPARK-18485][CORE] Underlying integer overflow when create ChunkedByteBufferOutputStream in MemoryStore
- [SPARK-18723][DOC] Expanded programming guide information on wholeTex…
- [SPARK-18708][CORE] Improvement/improve docs in spark context file
- [SPARK-18895][TESTS] Fix resource-closing-related and path-related test failures in identified ones on Windows
- [MINOR][BUILD] Fix lint-check failures and javadoc8 break
- [WIP][SPARK-18896][TESTS] Update to ScalaTest 3.0.1
- [SPARK-18845][GraphX] PageRank has incorrect initialization value that leads to slow convergence
- [SPARK-18836] [CORE] Serialize one copy of task metrics in DAGScheduler
- [SPARK-18356] [ML] KMeans should cache RDD before training
- [SPARK-18827][Core] Fix cannot read broadcast on disk
- [SPARK-18855][CORE] Add RDD flatten function
- [SPARK-18840][YARN] Avoid throw exception when getting token renewal interval in non HDFS security environment
- [SPARK-18471][MLLIB] In LBFGS, avoid sending huge vectors of 0
- [SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in commands for local-cluster mode to work around the path length limitation on Windows
- [SPARK-18830][TESTS] Fix tests in PipedRDDSuite to pass on Windows
- [SPARK-18816] [Web UI] Executors Logs column only ran visibility check on initial table load
- [SPARK-18767][ML] Unify Models' toString methods
- [SPARK-18835][sql] Don't expose Guava types in the JavaTypeInference API.
- [SPARK-18715][ML]Fix AIC calculations in Binomial GLM
- [MINOR][CORE][SQL] Remove explicit RDD and Partition overrides
- [CORE][MINOR] Stylistic changes in DAGScheduler (to ease comprehensio…
- [SQL][minor] simplify a test to fix the maven tests
- [SPARK-18812] [MLLIB] explain "Spark ML"
- [SPARK-18628][ML] Update Scala param and Python param to have quotes
- [MINOR][DOCS] Remove Apache Spark Wiki address
- [SPARK-18745][SQL] Fix signed integer overflow due to toInt cast
- [SPARK-18803][TESTS] Fix JarEntry-related & path-related test failures and skip some tests by path length limitation on Windows
- [SPARK-18697][BUILD] Upgrade sbt plugins
- [DOCS][MINOR] Clarify Where AccumulatorV2s are Displayed
- [SPARK-18231] Optimise SizeEstimator implementation
- [SPARK-18620][Streaming][Kinesis] Flatten input rates in timeline for streaming + kinesis
- [Spark-15155][Mesos] Optionally ignore default role resources
- [SPARK-18744][Core]Remove workaround for Netty memory leak
- [SPARK-18741][STREAMING] Reuse or clean-up SparkContext in streaming tests
- [SPARK-18652][PYTHON] Include the example data and third-party licenses in pyspark package.
- [SPARK-18555][SQL]DataFrameNaFunctions.fill miss up original values in long integers
- [SPARK-18718][TESTS] Skip some test failures due to path length limitation and fix tests to pass on Windows
- [SPARK-18701][ML] Fix Poisson GLM failure due to wrong initialization
- [SPARK-18717][SQL] Make code generation for Scala Map work with immutable.Map also
- [SPARK-18697][BUILD] Upgrade sbt plugins
- [SPARK-18374][ML]Incorrect words in StopWords/english.txt
- [MINOR][CORE][SQL][DOCS] Typo fixes
- [SPARK-18719] Add spark.ui.showConsoleProgress to configuration docs
- [MINOR] [README] Correct Markdown link inside readme
- [SPARK-18606][HISTORYSERVER]remove useless elements while searching
- [SPARK-18672][CORE] Close recordwriter in SparkHadoopMapReduceWriter before committing
- [SPARK-18685][TESTS] Fix URI and release resources after opening in tests at ExecutorClassLoaderSuite
- [SPARK-18653][SQL] Fix incorrect space padding for unicode character at Dataset.show
- [SPARK-11374][SQL] Support `skip.header.line.count` option for Hive Table
- [SPARK-18666][Web UI] Remove the codes checking deprecated config spark.sql.unsafe.enabled
- [SPARK-18638][BUILD] Upgrade sbt, Zinc, and Maven plugins